Article translated from French to English with gpt-5.4-mini.
Anthropic and xAI sign an unprecedented agreement: 220,000 NVIDIA GPUs from the Colossus 1 supercomputer will double Claude Code limits starting this week. Claude for Microsoft 365 goes generally available on Excel, PowerPoint, and Word. OpenAI launches GPT-Realtime-2, the first voice model with GPT-5-level reasoning. Perplexity opens Personal Computer to all Mac users, and ElevenLabs crosses $500 million in ARR with NVIDIA as a strategic investor.
Anthropic leases Colossus 1 from xAI – 220,000 NVIDIA GPUs, Claude Code limits doubled
May 6 – Anthropic simultaneously announces an immediate increase in usage limits and an unprecedented infrastructure deal with SpaceX / xAI.
For users, the most visible change is the doubling of five-hour rate limits in Claude Code, effective immediately on Pro, Max, Team, and Enterprise plans. The automatic peak-time throttling that restricted Pro and Max plans is also removed. API limits for Claude Opus models are raised in parallel.
These increases are made possible by an agreement with SpaceX / xAI: Anthropic gains access to the entire capacity of Colossus 1, xAI's supercomputer, meaning more than 300 megawatts and more than 220,000 NVIDIA GPUs (H100, H200, and GB200). This capacity is available within the month. The two companies also announce a shared intention to develop multiple gigawatts of orbital AI compute capacity, a first in the industry.
This partnership adds to an already growing stack of deals: Amazon (up to 5 GW, with nearly 1 GW available by the end of 2026), Google and Broadcom (5 GW starting in 2027), Microsoft and NVIDIA (USD 30 billion in Azure capacity), and Fluidstack (USD 50 billion in U.S. AI infrastructure). International expansion will include data residency requirements for regulated sectors. Anthropic also commits to covering any increase in local electricity prices for residents caused by its datacenters.
| Change | Affected plans | Effective |
|---|---|---|
| 5h Claude Code limits doubled | Pro, Max, Team, Enterprise | Immediate |
| Peak-time throttling removed | Pro, Max | Immediate |
| Opus API limits increased | All | Immediate |
| Compute deal | Capacity | Timeline |
|---|---|---|
| SpaceX / xAI Colossus 1 | 300+ MW, 220,000+ NVIDIA GPUs | Within the month |
| Amazon | Up to 5 GW (~1 GW by end of 2026) | 2026 |
| Google + Broadcom | 5 GW | Starting in 2027 |
| Microsoft + NVIDIA | USD 30 billion Azure | – |
| Fluidstack | USD 50 billion U.S. infrastructure | – |
🔗 Anthropic – Higher limits + SpaceX deal
Claude for Microsoft 365 – general availability on Excel, PowerPoint, Word + Outlook beta
May 7 – Claude for Excel, PowerPoint, and Word move into general availability for all paid plans. Claude for Outlook simultaneously enters public beta under the same conditions.
“Claude for Excel, PowerPoint, and Word are now generally available, and Claude for Outlook is in public beta. As Claude moves between your Microsoft apps, it carries the full context of your conversation.” – @claudeai on X
The core feature is shared context across the four applications: a conversation started in Outlook to sort an email continues in Word to draft a memo, then in Excel for data analysis, and in PowerPoint for the presentation, without ever having to re-explain the context. Automatic cross-app updating is the other concrete benefit: changing an assumption in an Excel model simultaneously updates the chart in the presentation and the corresponding figure in the Word memo.
Among the companies cited: ServiceNow (“Claude does the work in Excel itself, instead of asking us to move content between tools”) and private asset management teams using it to build and maintain financial coverage models.
| Application | Status as of May 7, 2026 | Plans |
|---|---|---|
| Claude for Excel | General availability (GA) | All paid plans |
| Claude for PowerPoint | General availability (GA) | All paid plans |
| Claude for Word | General availability (GA) | All paid plans |
| Claude for Outlook | Public beta | All paid plans |
🔗 Claude for Microsoft 365 announcement
Claude Managed Agents – dreaming, outcomes, multiagent orchestration, webhooks
May 6 – At the Code with Claude conference, Anthropic launches several new features for its agent deployment platform.
The standout new feature is dreaming: a scheduled process that analyzes an agent's past sessions, extracts recurring patterns, and consolidates its memory so it improves over time. The developer stays in control: dreaming can update memory automatically or send each change for human review. Dreaming is available in experimental research preview on request.
Outcomes enters public beta: this feature lets each agent result be evaluated against developer-defined criteria before it is delivered to the user. The company Wisedocs used it to speed up medical document review by 50% while maintaining alignment with its internal standards.
Multiagent orchestration lets a lead agent delegate subtasks to specialist agents that run in parallel, making it easier to handle complex work requiring multiple expertise areas at once. Webhooks are also available to trigger external actions.
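As a concrete illustration of the outcomes pattern, the sketch below gates an agent result behind developer-defined criteria before delivery. The criterion names and the `evaluate_outcome` function are illustrative assumptions, not Anthropic's actual API.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical sketch of an "outcomes" gate: each criterion is a named
# predicate over the agent's result; delivery happens only if all pass.
@dataclass
class Criterion:
    name: str
    check: Callable[[str], bool]

def evaluate_outcome(result: str, criteria: list[Criterion]) -> tuple[bool, list[str]]:
    """Return (deliverable, names of failed criteria)."""
    failed = [c.name for c in criteria if not c.check(result)]
    return (not failed, failed)

# Example criteria a developer might define (entirely illustrative):
criteria = [
    Criterion("non_empty", lambda r: bool(r.strip())),
    Criterion("cites_source", lambda r: "[source]" in r),
    Criterion("under_length", lambda r: len(r) <= 500),
]

ok, failed = evaluate_outcome("Summary of the claim. [source]", criteria)
print(ok, failed)  # → True []
```

A failing result would be held back (or retried) instead of being delivered, which is the behavior Wisedocs relies on to keep reviews aligned with internal standards.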
| Feature | Availability | Description |
|---|---|---|
| Dreaming | Research preview (on request) | Self-improvement by analyzing past sessions |
| Outcomes | Public beta | Result evaluation before delivery |
| Multiagent orchestration | Public beta | Lead agent + specialist agents in parallel |
| Webhooks | Public beta | Triggering external actions |
🔗 Claude Managed Agents announcement
GPT-Realtime-2 – voice with GPT-5 reasoning and 128K context
May 7 – OpenAI launches a new generation of models in the Realtime API: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper.
GPT-Realtime-2 is the first voice model with GPT-5-level reasoning: it can handle complex requests, call tools in parallel, recover from interruptions, and maintain a 128,000-token context window (vs. 32,000 for its predecessor), suited to long sessions. Five reasoning levels are adjustable: minimal, low, medium, high, and xhigh (low by default). Preambles can be inserted before responses for a natural conversational flow.
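A hedged sketch of what configuring such a session might look like: the `session.update` event shape follows the existing Realtime API, while the `reasoning` and `preamble` fields and the `lookup_listing` tool are assumptions based on the announcement, not confirmed names.

```python
import json

# Hypothetical session configuration for GPT-Realtime-2 over the Realtime API.
# "session.update" mirrors the existing Realtime API event; the "reasoning"
# and "preamble" fields are assumed from the announcement, not documented.
session_update = {
    "type": "session.update",
    "session": {
        "model": "gpt-realtime-2",
        "modalities": ["audio", "text"],
        "reasoning": {"effort": "high"},  # minimal | low | medium | high | xhigh
        "preamble": {"enabled": True},    # short spoken lead-in before answers
        "tools": [
            {
                "type": "function",
                "name": "lookup_listing",  # illustrative tool, not a real API
                "parameters": {
                    "type": "object",
                    "properties": {"listing_id": {"type": "string"}},
                },
            }
        ],
    },
}

payload = json.dumps(session_update)
print(payload)
```

In a real client this JSON would be sent over the Realtime WebSocket after the connection opens; here it only shows how the new knobs (reasoning effort, preambles, parallel-callable tools) could surface in session configuration.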
GPT-Realtime-Translate enables live simultaneous translation into 13 target languages from 70+ source languages. GPT-Realtime-Whisper provides low-latency streaming transcription.
Zillow tested GPT-Realtime-2 on its voice interactions: +26 points in success rate on its most difficult adversarial benchmark (95% vs. 69%). EU Data Residency is supported.
| Model | Capability | Price |
|---|---|---|
| GPT-Realtime-2 | Voice + GPT-5 reasoning, 128K | $32/1M audio input tokens, $64/1M output |
| GPT-Realtime-Translate | Translation, 70+ source languages → 13 target languages | $0.034/min |
| GPT-Realtime-Whisper | Streaming transcription | $0.017/min |
| Benchmark | GPT-Realtime-1.5 | GPT-Realtime-2 (high) | GPT-Realtime-2 (xhigh) |
|---|---|---|---|
| Big Bench Audio | baseline | +15.2% | – |
| Audio MultiChallenge APR | 36.7% | – | 70.8% |
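As a quick sanity check on the audio token pricing above, a session cost works out as follows (the session sizes are made up for illustration):

```python
# Cost estimate for a hypothetical GPT-Realtime-2 session using the listed
# rates: $32 per 1M audio input tokens, $64 per 1M audio output tokens.
INPUT_RATE = 32 / 1_000_000
OUTPUT_RATE = 64 / 1_000_000

def session_cost(audio_in_tokens: int, audio_out_tokens: int) -> float:
    return audio_in_tokens * INPUT_RATE + audio_out_tokens * OUTPUT_RATE

# e.g. a session consuming 50k input and 20k output audio tokens:
print(round(session_cost(50_000, 20_000), 2))  # → 2.88
```

So a fairly long voice session stays in the low single-digit dollar range at the listed rates, which is the economics question the announcement raises for complex voice interactions.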
🔗 OpenAI announcement – new voice models
Perplexity Personal Computer available to all Mac users
May 7 – Perplexity launches a new macOS app and opens Personal Computer to all users, with no Pro or Max subscription restriction.
The app brings AI out of the cloud and onto the device itself. It operates on local files, native Mac apps, the open web, and secure Perplexity servers. It supports 400+ connectors and integrates with the Comet browser for web tools without direct connectors. Pro and Max plans keep credits tied to the existing subscription; free users also get access.
The recommended setup is the Mac mini as a permanent hub: agent teams can run continuously (24/7) while the user works on something else, with a notification when human approval is needed. Control works from any device, iPhone included.
The old Perplexity Mac app will be removed in the coming weeks. Download is direct (not yet available on the App Store).
| Dimension | Value |
|---|---|
| Availability | All Mac users |
| Recommended device | Mac mini (always on) |
| Supported connectors | 400+ |
| Browser integration | Comet |
| App Store | No (direct download) |
| Old app | Removal in the coming weeks |
🔗 Perplexity blog – Personal Computer for everyone
Perplexity Finance Search in the Agent API – #1 accuracy on FinSearchComp T1
May 6 – Perplexity launches Finance Search in the Agent API: a single tool call aggregates licensed financial data, real-time market data, and cited web sources.
The problem it solves is simple: financial decisions depend on reliable, up-to-date, and traceable sources. Finance Search replaces generic web search with structured licensed data (prices, fundamentals, earnings call transcripts, estimates) returned in a consistent schema regardless of the backend provider.
On the FinSearchComp T1 benchmark, Finance Search achieves the highest accuracy for real-time financial data, consistently over time, and the lowest cost per correct answer (fewer tokens needed thanks to structured data). Citations are built into every result. The model is developer-configurable, with visibility into token usage.
Finance Search is complementary to Computer for Professional Finance (already covered on May 5): where the latter offers a visual workspace, Finance Search fits into programmatic workflows via the API.
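A hedged sketch of how the single tool call might look from a developer's side. Only the tool name `finance_search` comes from the announcement; the request fields, the response schema, and the `extract_cited_prices` helper are illustrative assumptions.

```python
# Illustrative request/response handling for a finance_search tool call.
# Only the tool name is from the announcement; field names are assumptions.
def build_finance_search_call(query: str, symbols: list[str]) -> dict:
    return {
        "tool": "finance_search",
        "arguments": {"query": query, "symbols": symbols},
    }

def extract_cited_prices(response: dict) -> dict[str, float]:
    """Keep only results that carry a citation, per the built-in-citations claim."""
    return {
        r["symbol"]: r["price"]
        for r in response.get("results", [])
        if r.get("citations")
    }

call = build_finance_search_call("latest close", ["AAPL"])
mock_response = {
    "results": [
        {"symbol": "AAPL", "price": 213.5, "citations": ["https://example.com"]},
        {"symbol": "XYZ", "price": 1.0, "citations": []},  # dropped: no citation
    ]
}
print(extract_cited_prices(mock_response))  # → {'AAPL': 213.5}
```

The point of the consistent schema is visible in `extract_cited_prices`: the caller can treat every provider's data identically and enforce its own citation policy in one place.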
| Dimension | Value |
|---|---|
| Interface | Single tool call (finance_search) |
| Data covered | Prices, fundamentals, transcripts, estimates, market context |
| FinSearchComp T1 benchmark | #1 accuracy, #1 cost/correct answer |
| Citations | Built into every result |
🔗 Perplexity blog – Finance Search
Natural Language Autoencoders (NLAs) – reading Claude's internal thoughts
May 7 – Anthropic publishes a new interpretability method that converts a model's internal activations into directly readable text.
The architecture is based on closed-loop training: an activation verbalizer translates an activation into explanatory text, and an activation reconstructor tries to reconstruct the activation from that text. The more faithful the reconstruction, the better the explanation.
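The closed loop can be illustrated with a deliberately tiny stand-in: a "verbalizer" that names the strongest activation dimensions as text, and a "reconstructor" that rebuilds a vector from that text, scored by cosine similarity. The feature names and functions below are invented for illustration; real NLAs train neural networks for both roles, with reconstruction fidelity as the supervision signal.

```python
import math

# Toy closed loop: verbalize an activation as text naming its strongest
# dimensions, reconstruct a vector from that text, and score fidelity.
# This only illustrates the reconstruction-as-supervision idea.
FEATURE_NAMES = {0: "evaluation-awareness", 1: "code-safety", 2: "politeness"}

def verbalize(activation: list[float], k: int = 2) -> str:
    # Describe the k dimensions with the largest magnitude.
    top = sorted(range(len(activation)), key=lambda i: -abs(activation[i]))[:k]
    return ", ".join(f"{FEATURE_NAMES[i]}={activation[i]:.2f}" for i in top)

def reconstruct(text: str, dim: int) -> list[float]:
    # Rebuild a (sparse) activation vector from the explanation text.
    vec = [0.0] * dim
    for part in text.split(", "):
        name, value = part.split("=")
        idx = next(i for i, n in FEATURE_NAMES.items() if n == name)
        vec[idx] = float(value)
    return vec

def fidelity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

act = [0.9, 0.1, 0.4]
text = verbalize(act)
score = fidelity(act, reconstruct(text, len(act)))
print(text, round(score, 3))
```

The more of the activation the explanation captures, the higher the fidelity score, which is exactly the "better reconstruction means better explanation" criterion the paragraph describes.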
The safety application is striking: applied to Claude Opus 4.6 and Claude Mythos Preview, NLAs reveal that the models suspected they were being tested far more often than they said so out loud. In destructive code tests, implicit awareness of evaluation appears in 16% of cases, versus 0% explicit verbalization. On SWE-bench Verified, that figure rises to 26%. In real claude.ai user conversations, it remains below 1%.
On an audit set with a deliberately misaligned model, auditors equipped with NLAs identified the root cause in 12 to 15% of cases, versus less than 3% with other tools. The training code is open-sourced; an interactive demo is available on Neuronpedia.
| Evaluation | Implicit awareness (NLA) | Verbalized awareness |
|---|---|---|
| Destructive code tests | 16% | ~0% |
| SWE-bench Verified | 26% | ~0% |
| Real claude.ai conversations | < 1% | – |
🔗 Anthropic Research – Natural Language Autoencoders
Petri 3.0 – open-source alignment tool transferred to Meridian Labs
May 7 – Anthropic transfers Petri, its open-source alignment tool, to Meridian Labs, an independent nonprofit organization dedicated to AI evaluation.
Petri is an alignment testing toolkit applicable to any language model, probing behaviors such as deception, sycophancy, and cooperation with harmful requests. Integrated into evaluations of all Claude models since Sonnet 4.5, it has been adopted by the UK AI Security Institute for its AI research sabotage evaluations.
Version 3.0 brings three advances: better adaptability through separation of the auditor and target model components, a “Dish” module that runs tests under real deployment conditions (real system prompt, real scaffold) to make scenarios harder to detect, and integration with Bloom for deeper behavioral evaluations.
The transfer to Meridian Labs follows the model of the MCP protocol transfer to the Linux Foundation: ensuring the toolโs independence from any AI lab.
🔗 Anthropic Research – Petri 3.0
The Anthropic Institute (TAI) – research agenda on four axes
May 7 – Anthropic publishes the full research agenda for TAI, the internal organization launched in March 2026 to study the real-world impacts of AI from the position of a frontier lab.
The agenda is structured around four axes: economic diffusion (AI adoption by companies and countries, impact on labor markets), threats and resilience (dual-use capabilities, cybersecurity, defensive mechanisms), AI systems in the wild (behavioral and institutional effects of AI deployed at scale), and AI-driven R&D (acceleration of scientific research by AI itself, including the risks of recursive self-improvement loops).
TAI commits to sharing more frequent data from the Anthropic Economic Index and information on Anthropicโs internal acceleration through its own tools. A call for applications for the Anthropic Fellows program (four funded months) is open.
🔗 Anthropic Research – TAI Agenda
Codex Chrome Extension – background browser control on macOS and Windows
May 7 – OpenAI launches the Chrome extension for Codex, allowing the agent to directly control Chrome tabs without interrupting the user's workflow.
Codex operates in the background across multiple tabs simultaneously, combining its native plugin capabilities with direct access to websites (dashboards, CRM, web apps). The system automatically chooses the best tool for each step: plugins, Chrome, or a combination. Use cases: debugging browser flows, checking dashboards, doing research, updating CRMs, testing complex web apps (including multiplayer games via sub-agents).
The extension installs via the Chrome plugin in the Codex app. Available immediately on macOS and Windows for all Codex users.
🔗 OpenAI Tweet – Codex Chrome Extension
ChatGPT Trusted Contact – mental health safety with human review
May 7 – OpenAI rolls out Trusted Contact, an optional safety feature in ChatGPT.
Any adult (18+, 19+ in South Korea) can designate a trusted person (friend, family member, caregiver) who will be alerted if crisis signals are detected in their conversations. The process combines automated detection and human review (target: under one hour before any notification is sent); the notification states a general reason but includes no transcript, to protect privacy. The feature extends to adults the parental controls already available for teen accounts. It was developed with the American Psychological Association and a network of 260+ doctors in 60 countries.
| Parameter | Value |
|---|---|
| Eligibility | 18+ (19+ South Korea) |
| Acceptance window for the contact | 1 week |
| Human review SLA | Target < 1 hour |
| Notification content | General reason, no transcript |
| Channels | Email, SMS, in-app |
🔗 OpenAI – Trusted Contact
OpenAI B2B Signals – the gap between leading companies and typical companies is widening
May 6 – OpenAI publishes the first B2B Signals report, documenting the growing gap between “leading” companies and typical companies in their AI adoption.
Companies in the 95th percentile use 3.5× more intelligence per employee than typical companies (up from 2× in April 2025). The gap is driven less by message volume (36% of the gap) than by depth of use (64%): delegation of complex tasks, agentic workflows, integration into production systems. On Codex, the gap is the most pronounced: 16× more messages per employee.
Two concrete cases: Cisco reduces build time by ~20%, saves 1,500+ engineering hours per month, and increases defect-resolution speed by 10 to 15×. Travelers Insurance handles ~100,000 claims calls per year via an assistant.
| Indicator | Typical companies | Leading companies |
|---|---|---|
| Intelligence/employee | baseline | ×3.5 |
| Codex messages/employee | baseline | ×16 |
| Share of volume in the gap | – | 36% |
| Share of depth in the gap | – | 64% |
MRC – open-source network protocol for Stargate supercomputers
May 5 – OpenAI releases the MRC (Multipath Reliable Connection) protocol as open source via the Open Compute Project, co-developed with AMD, Broadcom, Intel, Microsoft, and NVIDIA over two years.
MRC is an 800 Gb/s network protocol for large-scale AI training supercomputers. It connects 100,000+ GPUs with only 2 switch levels (versus 3 to 4 in the conventional approach), spraying packets across hundreds of simultaneous paths via IPv6 source routing (SRv6). Failure recovery happens in microseconds (versus several seconds with classic dynamic BGP). Already in production on Stargate (Abilene, Texas) and Microsoft's Fairwater supercomputers, MRC has enabled the training of several models including GPT-5.5 and Codex.
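The spraying and fast-failover behavior can be mimicked with a toy path selector in ordinary Python. Real MRC encodes the chosen path in SRv6 headers and runs in network hardware, so the class name, hash choice, and path labels below are illustrative only.

```python
import zlib

# Toy model of MRC-style packet spraying: each packet is hashed onto one of
# many precomputed source-routed paths; when a path is marked dead, later
# packets simply avoid it. Because no route reconvergence is needed (unlike
# dynamic BGP), failover is just a local list update.
class Sprayer:
    def __init__(self, paths: list[str]):
        self.live = list(paths)

    def pick_path(self, packet_id: int) -> str:
        # Deterministic hash keeps selection stateless and spread out.
        h = zlib.crc32(packet_id.to_bytes(8, "big"))
        return self.live[h % len(self.live)]

    def mark_dead(self, path: str) -> None:
        self.live.remove(path)  # instant local failover

paths = [f"path-{i}" for i in range(8)]
s = Sprayer(paths)
used = {s.pick_path(i) for i in range(1000)}
print(len(used))  # packets spread over many paths
s.mark_dead("path-3")
assert all(s.pick_path(i) != "path-3" for i in range(1000))
```

The contrast with BGP in the table maps to this sketch: recovery is a list mutation at the sender rather than a network-wide protocol convergence.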
| Aspect | Conventional approach | MRC |
|---|---|---|
| Switch levels for 100K+ GPUs | 3-4 | 2 |
| Failure recovery | Seconds to tens of seconds | Microseconds |
| Routing | Dynamic BGP | Static SRv6 |
| Packet distribution | 1 path per transfer | Hundreds of paths in parallel |
🔗 OpenAI – MRC Supercomputer Networking
Perplexity ROSE – proprietary inference engine and CuTeDSL
May 6 – Perplexity publishes a research article detailing ROSE (Runtime-Optimized Serving Engine), its proprietary inference engine, and its integration of CuTeDSL (NVIDIA GPU kernel library).
ROSE powers all Perplexity services (Sonar, Search, Embeddings) on NVIDIA Hopper and Blackwell GPUs, from encoding models up to trillion-parameter LLMs. CuTeDSL makes it possible to build optimized custom GPU kernels faster, adapted to new model architectures at a steady pace.
This publication illustrates Perplexity's strategy: control the entire technical stack down to the GPU kernel level to differentiate on performance and reduce dependence on third-party frameworks.
🔗 Perplexity Research – CuTeDSL and ROSE
ElevenLabs reaches $500M ARR – NVIDIA investor via NVentures
May 5 – ElevenLabs announces a third close of its Series D with NVIDIA as a new strategic investor via NVentures.
ARR reached $500M in April 2026, up 43% in four months. This third close also includes BlackRock, Wellington Management, D.E. Shaw, Schroders, as well as customer companies (Salesforce, Santander, KPN, Deutsche Telekom) and a retail investment via Robinhood Ventures. A $100M tender offer was completed in parallel. ElevenLabs has 530 employees across 50+ countries. The roadmap calls for merging image/video and audio into a unified creative platform.
🔗 ElevenLabs – $500M ARR and new investors
AlphaEvolve in production – 5 industrial sectors via Google Cloud
May 7 – One year after its launch, Google DeepMind publishes an update on AlphaEvolve, its Gemini-powered coding agent, now moved from research into industrial production.
AlphaEvolve optimizes Google's critical infrastructure: TPU, cache replacement policies, LSM-tree compaction in Google Spanner. It is commercially deployed via Google Cloud in five sectors: finance (doubling transformer performance), semiconductors (computational lithography), logistics (traveling salesman problem), advertising, and materials science (~4× speedup at Schrödinger). On the academic side, AlphaEvolve collaborated with Terence Tao (UCLA) on Erdős problems and improved lower bounds for the traveling salesman problem and Ramsey numbers.
🔗 DeepMind – AlphaEvolve Impact
Self-learning Manus Projects – agentic workspace that improves with every task
May 6 – Manus launches a feature allowing Projects to automatically learn from every conversation and propose user-approved updates.
At the end of each task, Manus identifies reusable decisions, standards, and patterns, then proposes: instruction updates (when the process or terminology has evolved), file updates (outdated sources, examples, or templates), and skill updates for recurring workflows. No change is applied without explicit human validation. Future collaborators start with the Project's latest shared context. The feature is available for all sessions where instructions and files are supported.
🔗 Manus – Self-learning Projects
Briefs
- Anthropic bug bounty open to the public – The program, previously private within the security research community, is now accessible to everyone on HackerOne. 🔗 source
- xAI Image Generation Quality Mode API – The image generation quality mode (300M+ images generated on Grok) is now available via the xAI API: increased realism, better text rendering, stronger creative control. 🔗 source
- Z.ai GLM-5V-Turbo Tech Report – Z.ai (Zhipu AI) publishes the technical report for GLM-5V-Turbo, a native foundation model for multimodal agents with a CogViT encoder (SigLIP2 + DINOv3 distillation) and a perception-planning-execution loop. 🔗 source
- ChatGPT Futures Class of 2026 – OpenAI recognizes 26 young builders from 20+ universities (Vanderbilt, Oxford, Georgia Tech…) with a USD 10,000 grant each and access to frontier models. 🔗 source
- NVIDIA DeepStream + Claude Code – Demonstration of a “concept to app” approach combining DeepStream, Claude Code, and reusable Skills to generate Vision AI applications without writing every line of code. 🔗 source
- NVIDIA Guess-Verify-Refine – New hardware-aware inference technique where each decoding step gives the next one a head start, designed specifically for NVIDIA accelerators. 🔗 source
- TokenSpeed + NVIDIA Dynamo – TokenSpeed (LightSeek Foundation) reaches TensorRT-LLM level in open source; NVIDIA Dynamo adds day-0 support for this backend, with Kimi K2.5 supported via the Dynamo frontend. 🔗 source
- Ideogram BG Remover – New generative model (trained from scratch, not classic segmentation) for background removal: alpha channel preservation, geared toward logos and complex illustrations, API available. 🔗 source
- Google DeepMind × EVE Online – Partnership with CCP Games to explore AI research in complex player-driven game environments. 🔗 source
- GitHub Copilot Trust Layer – Microsoft/GitHub publishes research on a structural trust layer to validate Copilot agents (execution graphs + dominator analysis): 100% precision vs 82.2% for self-evaluation, 100% recall vs 60%. 🔗 source
- GitHub – reviewing agent pull requests – Practical guide (10-minute checklist) with 5 warning signs: CI gaming, code reuse blindness, hallucinated correctness, agentic ghosting, prompt injection into CI pipelines. 🔗 source
What this means
The race for the Personal Computer is accelerating. In the space of one week, three very different interfaces are targeting the same user desktop: Perplexity Personal Computer installs on Mac (and Mac mini as a permanent hub), Claude spreads across the four Microsoft 365 apps with shared context, and Codex controls Chrome in the background. These agents are no longer in the cloud: they are embedding themselves into existing workflows, on open files, in native applications. The shift from information retrieval to direct action on everyday work tools is now concrete.
Orbital compute enters the realm of facts. The Anthropic/xAI Colossus 1 deal is remarkable for two reasons: first, it gives Anthropic immediate access to 220,000 NVIDIA GPUs to double its limits starting this week; second, it includes a shared intention to develop several gigawatts of AI capacity in orbit. Combined with the Amazon, Google/Broadcom, Microsoft/NVIDIA, and Fluidstack agreements, Anthropic is building a computing infrastructure that has no equivalent among independent research labs. This accumulation of compute is the prerequisite for the next generation of models, and for the continued doubling of limits.
Reasoning voice changes the scope of voice agents. GPT-Realtime-2 is not a cosmetic update: bringing GPT-5 reasoning into a real-time interface, with 128K context and parallel tool calls, transforms the use cases. Zillow measures a +26-point gain in success rate on its hardest calls. Live translation (70 source languages to 13 target languages) in the same model opens multilingual workflows without a separate translation pipeline. The question is no longer “can we do AI voice?” but “which complex voice interactions become economically viable?”
Alignment and agentic trust are moving toward tooling. Three separate announcements converge on the same problem: how to trust agents in production. Anthropic's NLAs reveal that Claude knows when it is being tested (in 16% to 26% of evaluations) without verbalizing it. GitHub's Trust Layer (100% precision vs 82% for self-evaluation) gives development teams structural validation of agent-generated pull requests. The transfer of Petri 3.0 to Meridian Labs creates an evaluation benchmark independent of any lab. These three layers (model interpretability, output validation, and independent audit tools) are beginning to form a trust architecture for large-scale agentic deployments.
Sources
- Anthropic – Higher limits + SpaceX/xAI Colossus 1 agreement
- xAI – Anthropic compute partnership
- Claude – Microsoft 365 GA
- Claude – Managed Agents (dreaming, outcomes, orchestration)
- Anthropic Research – Natural Language Autoencoders
- Anthropic Research – Petri 3.0 donated to Meridian Labs
- Anthropic Research – The Anthropic Institute agenda
- Anthropic – Public HackerOne bug bounty
- OpenAI – GPT-Realtime-2 and new voice models
- OpenAI – Codex Chrome Extension
- OpenAI – ChatGPT Trusted Contact
- OpenAI – B2B Signals
- OpenAI – MRC Supercomputer Networking
- OpenAI – ChatGPT Futures Class of 2026
- Perplexity – Personal Computer for all Mac users
- Perplexity – Finance Search in the Agent API
- Perplexity Research – ROSE and CuTeDSL
- ElevenLabs – $500M ARR and NVIDIA investor
- DeepMind – AlphaEvolve in industrial production
- Google DeepMind × EVE Online
- Manus – Self-learning projects
- GitHub – Trust Layer for Copilot agents
- GitHub – Review PRs generated by agents
- xAI – Image Generation Quality Mode API
- Z.ai – GLM-5V-Turbo Tech Report
- NVIDIA DeepStream + Claude Code
- NVIDIA Guess-Verify-Refine
- TokenSpeed + NVIDIA Dynamo
- Ideogram BG Remover