Anthropic documents progress toward recursive AI self-improvement, NVIDIA open-sources Nemotron 3 Ultra, Suno raises $400M

June 4, 2026 opens with a historic publication from the Anthropic Institute: AI is already accelerating its own development, with more than 80% of Anthropic’s code now written by Claude and an 8× gain in per-engineer productivity. At the same time, NVIDIA ships Nemotron 3 Ultra, a fully open-source 550-billion-parameter MoE model built for long-running agents. OpenAI rolls out Dreaming v3, a new memory architecture for ChatGPT that is about 5× cheaper to serve. GitHub Copilot gains a one-million-token context window. And Suno announces a $400 million Series D round at a $5.4 billion valuation.

Anthropic Institute — “When AI builds itself”: documented recursive self-improvement

June 4 — The Anthropic Institute publishes “When AI builds itself”, the first official documentation, backed by internal numbers, of progress toward a possible recursive self-improvement of AI. Co-authors Marina Favaro and Jack Clark present internal data from May 2026 showing that Claude now writes the majority of Anthropic’s code.

Indicator	Value (May 2026)
Share of Anthropic code written by Claude	>80% of lines merged into production
Code/engineer productivity gain	×8 in Q2 2026 vs 2024
Success rate for open-ended tasks	76% (+50 points in 6 months)
Code optimization speedup (Mythos Preview)	~52× vs ~3× for Opus 4 (May 2025)
Research decisions better than humans	64% (Mythos Preview vs 51% for Opus 4.5 in Nov. 2025)
Internal survey — estimated productivity gain	×4 with Mythos Preview (130 employees, March 2026)

The progression of autonomous task duration is particularly striking: Claude Opus 3 handled tasks of about 4 minutes in March 2024, Claude Sonnet 3.7 reached 1 hour 30 minutes in March 2025, Claude Opus 4.6 reached 12 hours in March 2026, and Mythos Preview exceeded 16 hours (the measurement limit of the METR benchmark) in May 2026. The duration doubles roughly every 4 months.
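
The doubling claim can be checked against the milestones quoted above. The sketch below is a back-of-the-envelope calculation, not the method used in the article: it assumes steady exponential growth between two data points and leaves out Mythos Preview, whose 16-hour figure is only a lower bound. The function name and data layout are illustrative.

```python
import math

# (model, months elapsed since March 2024, autonomous task duration in minutes)
# Figures are the ones quoted in the text. Mythos Preview is omitted because
# its 16-hour value is a measurement ceiling, not a measured duration.
MILESTONES = [
    ("Claude Opus 3", 0, 4),
    ("Claude Sonnet 3.7", 12, 90),
    ("Claude Opus 4.6", 24, 720),
]


def doubling_time_months(start, end):
    """Months per doubling, assuming exponential growth between two points."""
    _, t0, d0 = start
    _, t1, d1 = end
    return (t1 - t0) / math.log2(d1 / d0)


if __name__ == "__main__":
    for i, j in [(0, 1), (1, 2), (0, 2)]:
        a, b = MILESTONES[i], MILESTONES[j]
        months = doubling_time_months(a, b)
        print(f"{a[0]} -> {b[0]}: {months:.1f} months per doubling")
```

On these three points the implied doubling time falls between roughly 2.7 and 4 months depending on the interval chosen, which is the same order of magnitude as the figure given in the article.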

One concrete result: in April 2026, Claude agents resolved an open AI security problem end to end (hypotheses, tests, iterations) and recovered 97% of the performance gain, for an estimated $18,000 in compute over 800 cumulative hours. Two human researchers working for a week recovered 23%.

The article explores three scenarios: a plateau (considered the least likely), substantial automation with strategic human direction, and full recursive self-improvement where models build their successors without human intervention. The article concludes with an explicit call for a coordinated and verifiable pause in frontier AI development, contingent on the participation of the other major labs.

“Our internal data shows Claude is accelerating AI development—a possible path to recursive self-improvement, or AI autonomously building a more capable successor. It’s happening faster than we thought, and the implications deserve greater attention.” — @AnthropicAI

🔗 Anthropic Institute article

NVIDIA Nemotron 3 Ultra — 550B MoE open-source for long-running agents

June 4 — NVIDIA ships Nemotron 3 Ultra, a 550-billion-parameter open-source frontier model designed specifically for long-running AI agents. The launch makes the open weights actually available: after the initial announcement at Microsoft Build on June 2, they can now be downloaded from HuggingFace and accessed via Ollama Cloud.

Feature	Value
Architecture	Hybrid Mamba-Transformer MoE
Total parameters	550 billion
Active parameters	55 billion (NVFP4)
Inference speed	5× faster than comparable open-source frontier models
Agentic task cost reduction	Up to 30%
HuggingFace weights	`nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-NVFP4`
Cloud access	Ollama Cloud
Supported agent harnesses	OpenClaw, Hermes Agent (NousResearch), LangChain

The hybrid Mamba-Transformer MoE architecture makes it possible to run more reasoning cycles within the same time budget — that is what explains the speed gain. Nemotron 3 Ultra is post-trained for complex tasks: advanced coding, deep research, planning, tool use, and recovery after failures.

NVIDIA also publishes the synthetic data and the post-training recipes, which allows external teams to reproduce or refine the process.
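
For a sense of scale, the figures in the feature table can be turned into two rough numbers: the share of parameters active per token, and the raw size of the weights. This is only a sketch. It assumes every weight is stored at 4 bits, which the NVFP4 suffix suggests but the source does not state, and it ignores scaling factors, activations, and the KV cache. All names are illustrative.

```python
TOTAL_PARAMS = 550e9   # total parameters, from the feature table
ACTIVE_PARAMS = 55e9   # active parameters per token, from the feature table
BITS_PER_WEIGHT = 4    # assumption: all weights stored as 4-bit values


def active_fraction(active, total):
    """Share of the model's parameters used for each token."""
    return active / total


def weight_storage_gb(params, bits_per_weight):
    """Raw weight storage in gigabytes (1 GB = 1e9 bytes), excluding overhead."""
    return params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    size_gb = weight_storage_gb(TOTAL_PARAMS, BITS_PER_WEIGHT)
    print(f"Active share per token: {share:.0%}")
    print(f"Weights at {BITS_PER_WEIGHT} bits: ~{size_gb:.0f} GB")
```

Under these assumptions, about 10% of the parameters are active per token and the weights alone occupy roughly 275 GB, which gives an idea of the hardware needed to self-host the model.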

“Today we’re shipping Nemotron 3 Ultra. A 550B MoE frontier-intelligence open model built for long-running agents. It delivers 5x faster inference and lowers the cost of complex agentic tasks by up to 30% versus other open frontier models.” — @NVIDIAAI

🔗 HuggingFace — Nemotron 3 Ultra

Dreaming v3 — a new memory architecture for ChatGPT

June 4 — OpenAI rolls out Dreaming v3, a fully redesigned autonomous memory architecture for ChatGPT. The system addresses three limitations of the previous mechanism: keeping information fresh, correcting it over time, and scaling to Free users.

Objective	Description
Context continuity	Remember information once, reuse it in future conversations
Preference adherence	Apply personal constraints (diet, time zone, etc.)
Time-based updates	Automatically revise memories — after a trip, the AI knows you are back

The system’s history: saved memories arrived in April 2024 (manual declaration), followed by “Dreaming v0” in April 2025 (automatic background synthesis). Version 3 is architecturally autonomous and about 5× cheaper to serve, and that cost reduction is what made expansion to Free users possible.

Availability: today for Plus and Pro subscribers in the United States; rollout to other countries and to Free and Go users is planned for the coming weeks.

A “Memory Summary” page makes it possible to view a readable summary of what ChatGPT knows about the user, add or correct information, and define rules about the topics to address.

🔗 openai.com — Dreaming

Suno Series D — $400 million at a $5.4 billion valuation

June 3 — Suno announces a Series D round of $400 million, bringing the generative music platform’s valuation to $5.4 billion. The round is led by Bond Capital, with participation from new investors (IVP, USV — Union Square Ventures, Forerunner Ventures) and renewed support from Matrix VC, Lightspeed, and Menlo Ventures.

The valuation trajectory is remarkable: $125 million raised in May 2024, then $250 million in November 2025 at a $2.45 billion valuation, and now $400 million at $5.4 billion, more than double the valuation in seven months.
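
The "more than double" figure can be verified with a short calculation. The sketch below uses only the two valuations quoted above. The monthly rate is a hypothetical smoothing that assumes constant compounding between the two rounds, not something Suno has reported.

```python
PREVIOUS_VALUATION_BN = 2.45   # November 2025, from the text
CURRENT_VALUATION_BN = 5.4     # June 2026, from the text
MONTHS_BETWEEN = 7             # November 2025 to June 2026


def growth_multiple(old, new):
    """How many times larger the new valuation is than the old one."""
    return new / old


def monthly_growth_rate(old, new, months):
    """Constant monthly rate that would compound from old to new."""
    return (new / old) ** (1 / months) - 1


if __name__ == "__main__":
    multiple = growth_multiple(PREVIOUS_VALUATION_BN, CURRENT_VALUATION_BN)
    rate = monthly_growth_rate(
        PREVIOUS_VALUATION_BN, CURRENT_VALUATION_BN, MONTHS_BETWEEN
    )
    print(f"Multiple: {multiple:.2f}x")
    print(f"Implied monthly growth: {rate:.1%}")
```

The multiple comes out at about 2.2×, equivalent to roughly 12% compounded per month over the seven months.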

Suno is establishing itself as one of the world’s best-funded AI creative platforms, with a mission that has remained constant: to let more people experience the joy of making music, without technical constraints.

“We’re thrilled to announce Suno’s next chapter: a $400M Series D at a $5.4B valuation!” — @suno

🔗 Suno Blog — The Next Chapter

GitHub Copilot — 1M-token context window and configurable reasoning

June 4 — GitHub Copilot adds two major new capabilities available today in VS Code, Copilot CLI, and the GitHub Copilot app.

Capability	Availability	AI credits impact
1M-token context window	VS Code, Copilot CLI, Copilot app	Higher
Configurable reasoning levels	VS Code, Copilot CLI, Copilot app	Higher

The one-million-token context window makes it possible to work on larger codebases, longer documents, and complex multi-file projects without losing the thread. Until now, context limits forced developers to break up their work or simplify their requests on complex projects.

Configurable reasoning levels make it possible to calibrate the speed/depth balance and activate “extended thinking” for the most complex architectural and debugging challenges. GitHub recommends reserving the highest settings for complex multi-file problems — using extended context or higher reasoning consumes more AI credits per interaction.

Expansion to additional Copilot surfaces is planned for the coming weeks.

🔗 GitHub Changelog — Larger context windows

GitHub Copilot — June 4 updates

Copilot in Visual Studio — May 2026 update

June 4 — The May 2026 update for Copilot in Visual Studio 2026 strengthens planning and collaborative review.

Agent Plan: explores the repository in read-only mode, asks clarifying questions, and generates a detailed plan saved in .copilot/plans/plan-{title}.md. An “Implement plan” button switches to agent mode.
Skills panel: lists all detected agent skills from the workspace and user profile, with search by name or keyword.
Multi-file summary diff: after Copilot changes multiple files, a “change summary” view lets you accept or discard changes globally, by file, or by block.
Context window usage indicator: icon at the top of the input box with a “Summarize conversation” option to free up space.
Commit context addition: right-click a commit in Git History to attach it as context in Copilot Chat.

🔗 GitHub Changelog — Visual Studio May update

Copilot Chat on github.com — enriched PR context (general availability)

June 4 — Copilot Chat moves from public preview to general availability for all Copilot license holders, with enhanced capabilities when working on diffs and pull requests on github.com.

Code and chat side by side: view the conversation directly next to the code, comments, and inline changes without switching between the PR and the chat window.
Automatically loaded context: when a question concerns a diff or a PR, the relevant context is injected automatically — no more copying and pasting excerpts.
Access: the “Ask about this diff” button at the top of each diff, or via the dropdown menu when highlighting a line of code.

🔗 GitHub Changelog — Copilot Chat PR context

Claude Code v2.1.162

June 3 — Version v2.1.162 of Claude Code brings several UX improvements and important agent fixes.

Feature	Description
`claude agents --json` + `waitingFor`	JSON output now includes the reason a waiting session is blocked (e.g. a permission prompt)
`/effort` persistence confirmed	Explicit confirmation when the chosen level becomes the default for new sessions
Slash command autocomplete	Clicking a suggestion fills the command into the prompt without executing it; press Enter to confirm
Remote Control footer pill	Remote Control appears as a persistent pill at the bottom with a link to the session
Windsurf → Devin Desktop rename	Updated in `/ide`, `/terminal-setup`, `/scroll-speed`

Among the fixes: a silent hang at startup when the config directory is read-only (Claude Code now starts with an in-memory config), WebFetch rules not being applied on pre-approved domains, Windows permissions with backslashes, and several agent fixes (pasting images with Ctrl+V, sessions lost while backgrounding, terminal width in long sessions).

🔗 Claude Code v2.1.162 releases

ElevenLabs — Flows Agent and Hasbro partnership

Flows Agent in ElevenCreative

June 4 — ElevenLabs launches Flows Agent in its ElevenCreative interface. The user describes what they want to create and the agent automatically builds the complete pipeline — connecting more than 50 image and video models to the voice, music, and sound effects tools available on the platform, on a single unified canvas.

An “assist” mode lets the agent ask for approval before each paid operation to keep costs under control. Marketing teams can thus chain modalities and test creative variants across different products, languages, and formats without manually configuring each step.

🔗 ElevenLabs Flows

ElevenLabs × Hasbro — licensed character voices in the Iconic Marketplace

June 3 — ElevenLabs partners with Hasbro to offer official character voices (My Little Pony, Transformers, G.I. Joe) through the Iconic Marketplace. The voices are built in partnership with Hasbro and the original voice talents, with clearly defined usage rights for developers, companies, and application creators. The offering aims to combine AI creativity with protection of brands’ intellectual property rights.

🔗 ElevenLabs × Hasbro tweet

GPT-Rosalind — new capabilities for the life sciences

June 3 — OpenAI announces a major update to GPT-Rosalind, its specialized model for life sciences research at enterprise scale. The model combines GPT-5.5’s agentic capabilities with stronger intelligence in medicinal chemistry and genomics.

Benchmark	Domain	GPT-Rosalind score	GPT-5.5 score	Token reduction
LifeSciBench	Life sciences (6 domains)	Best	—	—
MedChemBench	Medicinal chemistry	27.5%	25.1%	-7.2%
GeneBench	Genomics	21.6%	20.4%	-31%
LabWorkBench	Wet-lab protocols	63.2%	55.8%	-5.3%
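
The gaps between the two models are easier to compare as point gains and relative gains. The sketch below only rearranges the scores from the table above. The helper names are illustrative, and nothing here comes from OpenAI's own analysis.

```python
# (benchmark, GPT-Rosalind score %, GPT-5.5 score %), copied from the table.
# LifeSciBench is omitted because the table gives no numeric scores for it.
SCORES = [
    ("MedChemBench", 27.5, 25.1),
    ("GeneBench", 21.6, 20.4),
    ("LabWorkBench", 63.2, 55.8),
]


def point_gain(rosalind, baseline):
    """Absolute gain in percentage points, rounded to one decimal."""
    return round(rosalind - baseline, 1)


def relative_gain(rosalind, baseline):
    """Gain relative to the baseline score, in percent, rounded to one decimal."""
    return round((rosalind - baseline) / baseline * 100, 1)


if __name__ == "__main__":
    for name, rosalind, baseline in SCORES:
        points = point_gain(rosalind, baseline)
        relative = relative_gain(rosalind, baseline)
        print(f"{name}: +{points} points ({relative}% relative)")
```

Read this way, the largest improvement is on LabWorkBench (+7.4 points), while the MedChemBench and GeneBench gains are +2.4 and +1.2 points respectively.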

Two new plugins are now available to all Codex users: the Life Sciences Research plugin (sourced evidence retrieval) and the Life Sciences NGS Analysis plugin (scRNA-seq and bulk RNA-seq bioinformatics workflows). Novo Nordisk is the first announced partner. Access is being expanded globally to qualified organizations (legitimate scientific research, strong governance).

🔗 openai.com — GPT-Rosalind

Perplexity launches the Main Street AI Accelerator with the U.S. SBA

June 4 — Perplexity launches the Main Street AI Accelerator in partnership with the U.S. Small Business Administration (SBA). The program makes $25 million in Perplexity Computer credits available: $250 in credits for up to 100,000 eligible businesses, in reference to the 250th anniversary of the United States.

Businesses that benefit from the SBA 7(a), 504, and microloan programs are eligible. Applications are not yet open; a waitlist is available on the dedicated page. The initiative fits into Perplexity’s strategy of bringing Computer to the U.S. local economy, one week after it announced 400+ enterprise integrations for Computer (Intuit QuickBooks, Vercel, Shopify, Canva).

🔗 Main Street AI Accelerator

Cohere wins 1st place in the NATO challenge on agentic AI

June 4 — Cohere takes first place in the NATO Agentic AI for Cognitive Warfare Innovation Challenge. The full podium:

Cohere (1st place)
OpenMinds (2nd place)
Ipsos & Thoughtworks (3rd place, tied)

The competition highlights the growing role of agentic AI in helping democratic nations understand, anticipate, and respond to information threats. For Cohere, this NATO recognition confirms its positioning on sovereign AI for the defense and government sectors — a major focus since its transatlantic merger with Aleph Alpha in April 2026.

🔗 Cohere — NATO Challenge

Pika — Group Chat with an AI agent on iOS

June 4 — Pika launches the first Group Chat integrated with an AI agent in its app. Users invite their contacts into a group chat where the Pika Agent joins the creative conversation — helping set up a phone, create group memes, and collaborate on short-form video content. Available now on iOS via https://pika.me.

🔗 Pika Tweet

Briefs

Anthropic article — self-service analytics with Claude — The Anthropic team publishes its best practices for building self-service data analysis agents with Claude: skills, data foundations, and evaluations. 🔗 Claude Blog
Google Antigravity v2.0.11 — Stability patch for the Gemini-powered IDE: two fixes (startup hangs and the “Open IDE” button), no new features. 🔗 Antigravity Changelog
GitHub Enterprise Teams GA — Enterprise Teams reaches general availability on GitHub Enterprise Cloud: groups defined once at the enterprise level, assignable across all organizations, with SCIM, GitHub Apps, and full auditing. Up to 2,500 teams and 5,000 members per team. 🔗 GitHub Changelog
Genspark — launch partner for Agent365 at Microsoft Build — Co-founder Ray Zhong took the stage at Microsoft Build as a global strategic partner and launch partner for Agent365, bringing agentic AI into Microsoft’s existing enterprise infrastructure. 🔗 Genspark Tweet
Cohere backs Canada’s national AI strategy — CEO Aidan Gomez reaffirms Cohere’s Canadian roots, praising Canada’s new national AI strategy as an important step toward technological sovereignty and building next-generation AI in the country. 🔗 Cohere Tweet

What it means

AI self-improvement is moving from a theoretical scenario to measured internal data. The Anthropic Institute publication is not speculation — it is a field report with precise numbers: >80% of code, 8× productivity, 76% success rate on open-ended tasks. The autonomous task duration doubling every 4 months is the most concrete signal of the ongoing dynamic. What was discussed in AI safety circles as a future risk is now documented as a present reality. The call for a coordinated pause — with Anthropic as the first requester — illustrates the tension between commercial competition and regulatory caution.

Frontier-level open source is changing scale. Nemotron 3 Ultra, at 550 billion parameters and fully open source (downloadable weights, synthetic data, and published recipes), redefines what “open source” means for frontier models. The 5× faster inference and the cost reduction of up to 30% for agentic tasks are not marginal: they make it viable to run complex agents outside large proprietary clouds. For teams building autonomous agents, this is a new infrastructure layer taking shape.

Developer tooling is consolidating around the long-running agent. GitHub Copilot with 1M tokens of context, Claude Code v2.1.162 with waitingFor in the agents JSON, ElevenLabs’ Flows Agent building multimodal pipelines — these three announcements share the same paradigm: the agent must manage long contexts, communicate its state to other systems, and orchestrate multiple tools without human intervention. Copilot’s “configurable reasoning” and Claude Code’s persistent /effort answer the same question: how can the user tune the depth of reasoning according to task complexity?

The creative AI economy is reaching a symbolic milestone. Suno’s valuation more than doubling to $5.4 billion in seven months signals that investors are betting on generative music creation at consumer scale. Combined with recent fundraising in video (Runway, Pika), the AI creation sector now has a capitalization comparable to that of major traditional creative software publishers. Hasbro’s entry into ElevenLabs’ Iconic Marketplace shows how intellectual property holders are adapting: rather than blocking AI, they are monetizing it under license.