Search

GPT-5.5 Instant new ChatGPT default, Grok 4.3 on xAI API, Anthropic x Blackstone enterprise

GPT-5.5 Instant new ChatGPT default, Grok 4.3 on xAI API, Anthropic x Blackstone enterprise

ai-powered-markdown-translator

Article translated from fr to en with gpt-5.4-mini.

View the project on GitHub โ†—

Busy week: OpenAI pushes GPT-5.5 Instant to directly replace GPT-5.3 for all ChatGPT users, xAI launches Grok 4.3 on its API with a one-million-token context, and Anthropic announces a new enterprise AI services company co-founded with Blackstone, Hellman & Friedman, and Goldman Sachs. On the tooling side, GitHub triples security GA/previews via its MCP server, Perplexity launches a product dedicated to financial teams, and Runway presents real-time video agents generated from a single image.


GPT-5.5 Instant โ€” new default ChatGPT model

May 5 โ€” OpenAI replaces GPT-5.3 Instant with GPT-5.5 Instant as the default ChatGPT model for all users. The rollout spans two days.

DimensionImprovement vs GPT-5.3 Instant
Hallucinations (medicine, law, finance)-52.5%
Incorrect claims (reported by users)-37.3%
Response verbosity-30.2% words on average

The model also improves image analysis, STEM answers, and the decision to use web search. Responses are more concise without losing substance, with less unnecessary formatting and fewer irrelevant follow-up questions.

Memory sources โ€” OpenAI introduces โ€œmemory sourcesโ€ across all ChatGPT models: when a response is personalized from saved memories, past conversations, or connected Gmail, the user sees exactly which sources were used and can correct or delete them. Personalization from past conversations and files is initially reserved for Plus and Pro subscribers (web), with other plans to follow.

Availability:

  • Gradual rollout over 2 days for all ChatGPT users
  • Available via API under the alias chat-latest
  • GPT-5.3 Instant remains accessible for 3 months for paying subscribers

๐Ÿ”— Official GPT-5.5 Instant announcement


Grok 4.3 launched on the xAI API โ€” 1M token context, #1 agentic tool calling

May 5 โ€” xAI announces via X the launch of Grok 4.3 on the xAI API (console.x.ai). The model is presented as the fastest and smartest in the lineup to date.

FeatureValue
Context window1 million tokens
Agentic tool calling benchmark#1 (@ArtificialAnlys leaderboard)
Instruction following benchmark#1 (@ArtificialAnlys leaderboard)
Enterprise domains#1 case law and corporate finance (@ValsAI)
AvailabilityxAI API (console.x.ai) โ€” not yet on grok.com

Grok 4.3 is now live on the xAI API. Itโ€™s our fastest, most intelligent model to date. It tops the @ArtificialAnlys leaderboards in agentic tool calling and instruction following, and ranks #1 in @ValsAI enterprise domains like case law and corporate finance. Grok 4.3 supports a 1 million token context. โ€” @xai on X

The tweet generated 25.7 million views and 6,029 likes. Note: no dedicated page on x.ai/news at the time of the announcement โ€” the launch went exclusively through X.


Anthropic and Blackstone, Hellman & Friedman, Goldman Sachs โ€” new enterprise AI services company

May 4 โ€” Anthropic, Blackstone, Hellman & Friedman, and Goldman Sachs announce the creation of a new enterprise AI services company, backed by a consortium of additional alternative investors.

The goal: deploy Claude in the core operations of large companies for tasks that require intensive engineering and deep industry knowledge. According to Anthropic, enterprise demand for Claude exceeds what a single distribution model can absorb.

The typical operating model starts with a small team working closely with the client to identify friction points, then building Claude agents tailored to the business. The concrete example given: a multi-site network of medical practices where Claude handles clinical documentation, repetitive administrative tasks, and coordination between specialties, allowing clinicians to focus on patient care.

The new company will join the Claude Partner Network, alongside Accenture, Deloitte, and PwC. It represents a structural step in Anthropicโ€™s enterprise distribution strategy: rather than selling only API licenses, the company is now engaging in complex operational deployments with top-tier financial partners.

๐Ÿ”— Official announcement


Claude agents for financial services and insurance

May 5 โ€” Anthropic launches ten ready-to-run agent templates for financial services and insurance. Available as plugins in Claude Cowork or Claude Code, or as standalone Claude Managed Agents on the Claude platform.

Research and client coverage:

AgentRole
Pitch builderTarget lists, comparables, pitchbooks
Meeting preparerClient and counterparty briefs
Earnings reviewerTranscript reading and model updates
Model builderFinancial model creation from filings and data
Market researcherSector monitoring and news synthesis

Finance and operations:

AgentRole
Valuation reviewerValuation checks
General ledger reconcilerAccounting reconciliation and NAV calculations
Month-end closerMonthly close and accounting entries
Statement auditorFinancial statement review
KYC screenerEntity file assembly and compliance screening

Claude now integrates into Microsoft Excel, PowerPoint, Word, and Outlook (in progress) via add-ins. Claude Coworkโ€™s Dispatch feature lets you assign tasks by text or voice from anywhere.

New data connectors: Dun & Bradstreet, Fiscal AI, Financial Modeling Prep, Guidepoint, IBISWorld, SS&C IntraLinks, Third Bridge, Verisk, and a Moodyโ€™s MCP (ratings and data on more than 6,000 entities).

Among the clients cited: Citadel, FIS, BNY, Carlyle, Mizuho, Travelers, Walleye Capital (100% of employees use Claude Code), Hg, Morningstar, FactSet. These agents are optimized for Claude Opus 4.7, ranked #1 on the Vals AI Finance Agent benchmark.

๐Ÿ”— Official announcement


Perplexity Computer for Professional Finance

May 5 โ€” Perplexity launches Computer for professional finance, a version of Computer designed specifically for analysis and investment teams: buy-side and sell-side analysts, hedge funds, private equity.

DimensionValue
Included workflows35 (10 segments)
Integrated data providers14 (including Quartr, Fiscal)
Premium MCP connectorsMorningstar, PitchBook, Daloopa, Carbon Arc
Available platformsMicrosoft Teams, Agent API
Coming soonExcel add-in
FinSearchComp T1 benchmark1st (accuracy, cost, latency)

Teams with licensed subscriptions can connect their own credentials via MCP connectors to access Morningstar, PitchBook, Daloopa, and Carbon Arc. Others get access to built-in financial tools powered by 14 data providers.

Every numerical value links back to its source: for values from SEC documents, Computer shows the calculation and points to the exact pages in the document. On the FinSearchComp T1 benchmark (time-sensitive data extraction), Perplexity ranks first in accuracy, cost per correct answer, and latency โ€” including real-time prices, crypto prices, and exchange rates.

๐Ÿ”— Perplexity blog โ€” Computer for Professional Finance


Runway Characters โ€” real-time video agent from a single image

May 4 โ€” Runway announces Characters, a technology that turns a single image into a real-time conversational video agent.

MetricValue
End-to-end latency1.75 seconds
Video quality24 fps HD
Required image source1 image only
Cold starts60ร— faster (peer-to-peer GPU)

The 1.75-second delay is measured from the moment the user stops speaking to the characterโ€™s first response. Runway simultaneously published two engineering papers: the first describes the real-time video agent architecture, the second explains how peer-to-peer GPU infrastructure reduces cold-start times by 60.

Target use cases include conversational agents, interactive real-time characters, and video interfaces for applications. The technology marks a shift from offline video rendering to synchronous interaction.

๐Ÿ”— Runway Characters announcement tweet


GitHub MCP Server โ€” Triple security advance

May 5 โ€” GitHub simultaneously publishes three security updates for its MCP server, all on the same day.

Secret scanning GA

Secret scanning via GitHub MCP Server moves to general availability (out of preview since March 2026). In GitHub Copilot CLI, installation is done with /plugin install advanced-security@copilot-plugins; in VS Code, the advanced-security plugin exposes the /secret-scanning command.

AspectDetail
StatusGA (general availability)
AvailabilityRepositories with GitHub Secret Protection enabled
IntegrationsCopilot CLI, VS Code, any MCP-compatible IDE

MCP tools now respect existing push protection customizations โ€” bypass behavior is consistent with the repository or organization configuration.

๐Ÿ”— Changelog โ€” Secret scanning GA

Dependency scanning in public preview

Dependency vulnerability detection via MCP Server moves to public preview. The system queries the GitHub Advisory Database and returns structured results with affected packages, severity, and recommended fixed versions.

AspectDetail
StatusPublic preview
AvailabilityRepositories with Dependabot alerts enabled
CLI activationcopilot --add-github-mcp-toolset dependabot

๐Ÿ”— Changelog โ€” Dependency scanning

GitHub Advanced Security ร— Microsoft Defender for Cloud GA

The GitHub Advanced Security ร— Microsoft Defender for Cloud integration also moves to GA. It correlates container images deployed in cloud environments with GitHub source code, bringing runtime context into security views.

New filters available in the organization view: has:deployment, runtime-risk:internet-exposed, runtime-risk:sensitive-data. Security campaigns can be assigned directly to the GitHub Copilot coding agent.

๐Ÿ”— Changelog โ€” Code-to-cloud GA


Model Spec Midtraining (MSM) โ€” agentic alignment reduced from 68% to 5%

May 5 โ€” Anthropic researchers publish โ€œModel Spec Midtrainingโ€ (MSM), an alignment method inserted between pretraining and alignment fine-tuning (AFT).

The principle: models are trained on a synthetic corpus of documents discussing the content of their Model Spec before learning to follow its rules. The idea is that understanding why a rule exists improves the robustness of its application.

ModelMisalignment (AFT only)With MSM + AFT
Qwen2.5-32B68%5%
Qwen3-32B54%7%

MSM also makes AFT much more data-efficient: 40 to 60 times less AFT data is needed to achieve comparable performance. The authors also show that explaining the motivations behind the rules (rather than multiplying sub-rules) improves out-of-distribution generalization.

๐Ÿ”— MSM paper โ€” alignment.anthropic.com


NotebookLM Mind Maps โ€” customization, organization, navigation

May 5 โ€” NotebookLM improves its Mind Maps with three features rolled out simultaneously.

FeatureDescription
CustomizationGuide the map with specific user instructions
OrganizationRename and share Mind Map charts instantly
NavigationSmooth transitions between nodes

The rollout is gradual for all users. The update completes NotebookLMโ€™s sequence of rapid improvements since early April: automatic source organization (April 24, 100% rollout reached on May 5), integration into the Gemini mobile app (April 30).

๐Ÿ”— NotebookLM tweet


Genspark sb-git โ€” Git server rewritten for AI agents

May 5 โ€” Genspark launches sb-git, a Git server rewritten from scratch for AI agents. Complete Git semantics: versioning, branches, diff, blame, rollback, and push.

AspectDetail
CLIgsk (init, clone-url, cat, commit)
CompatibilityClaude Code, OpenClaw, any Git agent
Storage1 GB (free), 10 GB (Plus/Pro)
Account requiredNo โ€” no GitHub account needed
AvailabilityImmediate (web + mobile)

No GitHub account required, no prior repository setup. The focus is on compatibility with common AI agents (Claude Code, OpenClaw) without installation friction.

๐Ÿ”— Genspark sb-git tweet


NVIDIA + ServiceNow โ€” Project Arc, autonomous long-running desktop agent

May 5 โ€” At the ServiceNow Knowledge 2026 conference, Jensen Huang and Bill McDermott announced the expansion of their partnership around autonomous AI agents in the enterprise.

ServiceNow launches Project Arc, an autonomous long-running desktop agent designed for knowledge workers: developers, IT teams, administrators. The agent uses NVIDIA OpenShell (open source sandbox) for governance and security, and connects natively to the ServiceNow platform via ServiceNow Action Fabric.

MetricValue
Blackwell vs Hopper efficiency50ร— tokens/watt
Cost reduction per million tokens~35ร—
Nemotron 3 Super (open source)#1 EnterpriseOps-Gym (NOWAI-Bench)
Tickets resolved autonomously90% (ServiceNow + Apriel/Nemotron)

๐Ÿ”— NVIDIA Blog โ€” ServiceNow


NVIDIA NemoClaw + OpenClaw โ€” persistent open source agent surpassing React on GitHub

April 30 โ€” OpenClaw (created by Peter Steinberger) surpassed 250,000 GitHub stars in 60 days, overtaking React to become the platformโ€™s most-starred project. NVIDIA is collaborating with the community to secure this persistent self-hosted AI agent project.

NVIDIA is launching NemoClaw, a one-command-installable reference implementation, combining OpenClaw + NVIDIA OpenShell + Nemotron with security-hardening configurations enabled by default.

MetricValue
OpenClaw GitHub stars250,000+ (March 2026)
Growth#1 GitHub project in 60 days (surpasses React)
Agent inference multiplier vs reasoning AI1,000ร—
NemoClaw installation1 command

๐Ÿ”— NVIDIA Blog โ€” OpenClaw/NemoClaw


Luma AI Uni-1.1 API โ€” image generation that reasons about creative briefs

May 5 โ€” Luma AI is launching the Uni-1.1 API, an image generation model designed to reason about creative briefs rather than tokens. Unlike traditional APIs that require prompt engineering, Uni-1.1 understands the aesthetic context of each visual tradition and produces usable results on the first try.

Use cases cited: fashion tools, architectural renders, manga pipelines, cinematic content. No middleware required. The API is available at lumalabs.ai/api.

๐Ÿ”— Luma AI Uni-1.1 Tweet


ChatGPT Ads Manager self-serve and CPC bidding

May 5 โ€” OpenAI is expanding its advertising program with two new features: a self-serve tool (Ads Manager, in beta in the US) and the launch of CPC bidding mode (cost per click).

ModeStatusDescription
CPM (cost per thousand impressions)ExistingAvailable since the program launched
CPC (cost per click)NewThe advertiser pays only when a click actually happens
Ads Manager self-serve (beta)NewAvailable to US advertisers

Agency partners: Dentsu, Omnicom, Publicis, WPP. Technology partners: Adobe, Criteo, Kargo, Pacvue, StackAdapt. OpenAI also launched a Conversions API and pixel tracking to measure post-click actions without exposing individual conversations to advertisers.

๐Ÿ”— OpenAI advertising announcement


Perplexity Premium Health Sources

May 5 โ€” Perplexity launches Premium Health Sources. More than one in ten queries on the platform are about health. The sources available at launch are NEJM, BMJ Journals, and BMJ Best Practice โ€” medical references usually reserved for institutional subscriptions.

In Computer, these sources activate automatically for health questions without manual selection. Each answer includes traceable citations. Upcoming sources: Micromedex, EBSCOhost, Health Affairs, VisualDx, American Academy of Orthopaedic Surgeons, American Diabetes Association, Springer Publishing.

๐Ÿ”— Perplexity Blog โ€” Premium Health Sources


Briefs

  • Manus โ€” Automatic connector recommendation โ€” Manus now detects which connector (Slack, Notion, Gmail, Google Drive) is needed to complete a task and recommends it in the conversation, without leaving the thread. Activation still requires user confirmation. ๐Ÿ”— source

  • Black Forest Labs โ€” FLUX Creator Program โ€” BFL is opening a selective creator program for early access to upcoming FLUX models, with amplification of their work through BFL channels. ๐Ÿ”— source

  • GPT-5.5 Instant System Card โ€” First System Card in the Instant lineup classified as โ€œHigh capabilityโ€ in OpenAIโ€™s Preparedness Framework categories of Cybersecurity and Biology & Chemistry. Additional safeguards have been implemented accordingly. ๐Ÿ”— source

  • OpenAI โ€” WebRTC relay+transceiver architecture โ€” OpenAI publishes an engineering article describing the redesign of its WebRTC infrastructure for real-time voice (ChatGPT Voice, Realtime API), serving more than 900 million weekly users. The architecture separates packet routing (lightweight, stateless relay) from protocol termination (transceiver, stateful), enabling standard Kubernetes deployment with a reduced public UDP footprint. ๐Ÿ”— source


What this means

Finance as the primary playground for enterprise AI. Within 24 hours, Anthropic, Perplexity, and xAI each published announcements explicitly targeting financial teams: ten Claude agent templates (valuation, KYC, month-end close), Computer for Professional Finance with 35 workflows and 14 data providers, and Grok 4.3 ranked #1 on the Vals AI benchmark in corporate finance and case law. The convergence is no accident โ€” finance combines structured document volume, precision requirements, and tolerance for premium tool costs, making it the ideal ground for the first high-value autonomous agent deployments.

The race for default models. GPT-5.5 Instant reduces hallucinations by 52.5% compared with its immediate predecessor, and Grok 4.3 reaches a one-million-token context with measured, published agentic performance. These two models launch on the same day. The goal is no longer just to publish the best academic benchmarks, but to be the model loaded by default in consumer interfaces (ChatGPT) or activated first in developer pipelines (xAI API).

MCP as a developer security standard. GitHub simultaneously released three security updates through its MCP server (secret scanning GA, dependency scanning in preview, code-to-cloud GA). This coordinated rollout turns GitHubโ€™s MCP server into a native security integration channel for coding agents โ€” Copilot CLI, VS Code, and any MCP-compatible IDE can now scan for secrets and vulnerable dependencies before every commit, directly in the agent workflow.

Persistent agents and real-time infrastructure. Runway Characters (a video agent with 1.75 s latency from an image), ServiceNowโ€™s Project Arc (a long-running desktop agent), OpenClaw/NemoClaw (250,000 GitHub stars, 1,000ร— more inference demand than reasoning AI), and Genspark sb-git (Git rewritten for agents) all signal the same shift: AI agents are moving out of the one-off query era and into that of persistent processes, with radically different infrastructure needs โ€” state storage, real-time latency, native versioning.


Sources