Article translated from fr to en with gpt-5.4-mini.
On May 20, 2026, AI broke into fundamental mathematics: an OpenAI model refuted a conjecture posed by Paul Erdős in 1946, with a 125-page proof validated by mathematicians including Fields Medalist Tim Gowers. On the model side, Cohere releases Command A+ as open source under Apache 2.0 (MoE architecture, 218B total parameters, 25B active), NVIDIA launches Nemotron-Labs-Diffusion with parallel token generation, and Stability AI unveils Stable Audio 3.0 (a family of 4 models, 3 of them open-weight). On the tooling side, GitHub Copilot evolves on four fronts at once, and Claude Code ships two versions in 24 hours.
OpenAI refutes an 80-year-old Erdős conjecture
May 20 — OpenAI has published a first-of-its-kind result: an internal general reasoning model disproved the long-standing conjecture on the planar unit distance problem, open since Paul Erdős posed it in 1946. The problem asks for the maximum number of pairs of points at distance exactly 1 among n points in the plane. Since the 1940s, the mathematical community had believed that Erdős’s square-grid constructions were essentially optimal.
The model produced a proof showing the existence of an infinite family of configurations that exceeds the conjectured bound, with an exponent δ = 0.014 established by Will Sawin (Princeton). The breakthrough relies on an unexpected mathematical tool: infinite class field towers and Golod-Shafarevich theory, from algebraic number theory, applied to a basic Euclidean geometry problem. This connection between two a priori distant fields is, according to the mathematicians involved, the heart of the originality of the result.
| Aspect | Detail |
|---|---|
| Problem | Planar unit distances (Erdős, 1946) |
| Conjectured bound | n^(1+C/loglog(n)), the growth of Erdős’s grid construction (1946) |
| Best known upper bound | O(n^(4/3)) (Spencer-Szemerédi-Trotter, 1984) |
| New result | n^(1+δ), δ = 0.014 |
| Mathematical tool | Algebraic number theory (Golod-Shafarevich) |
| Model | Internal general reasoning model (unnamed) |
| Chain-of-thought length | 125 pages |
| Validation | External mathematician group + companion paper |
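To make the gap concrete, the comparison can be written out as below. This is a sketch in notation introduced here, not taken from the paper: u(n) denotes the maximum number of unit-distance pairs among n points in the plane, C is an unspecified positive constant, and "for infinitely many n" is how the "infinite family of configurations" is read.

```latex
% u(n): maximum number of unit-distance pairs among n points in the plane
% (notation assumed for this sketch)
\[
  \text{conjectured: } u(n) \le n^{1 + C/\log\log n}
  \qquad\text{vs.}\qquad
  \text{new: } u(n) \ge n^{1+\delta}
  \ \text{ for infinitely many } n,\quad \delta = 0.014 .
\]
% Why a fixed exponent gain beats the conjectured growth:
\[
  \frac{n^{1+\delta}}{n^{1 + C/\log\log n}}
  = n^{\,\delta - C/\log\log n} \longrightarrow \infty
  \quad\text{as } n \to \infty,
  \ \text{ since } C/\log\log n \to 0 .
\]
```

The second line is the reason even a small fixed δ matters: the conjectured exponent tends to 1, so any constant gain eventually dominates, although only for very large n.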
What makes the result especially notable: it was not produced by a system trained specifically for mathematics or targeted at this problem. It is a general-purpose model, evaluated on a collection of Erdős problems as part of a broader exploration of autonomous research capabilities.
Tim Gowers (Fields Medal) calls the result a “milestone in AI mathematics”. Arul Shankar (Princeton) goes further:
“In my opinion this paper demonstrates that current AI models go beyond just helpers to human mathematicians – they are capable of having original ingenious ideas, and then carrying them out to fruition.” — Arul Shankar, number theorist, Princeton
OpenAI sees this result as a signal for fundamental research: if a model can sustain complex reasoning across 125 pages and connect distant mathematical domains, those capabilities could carry over to biology, physics, materials science, and medicine.
Cohere Command A+ — open-source MoE flagship
May 20 — Cohere launches Command A+, its most powerful model to date, released as open source under the Apache 2.0 license. The sparse mixture-of-experts (MoE) architecture has 218B total parameters, of which only 25B are active for each token, allowing it to run on two NVIDIA H100 GPUs or a single Blackwell (B200) GPU with W4A4 quantization.
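A quick back-of-the-envelope check shows why 4-bit weights make these hardware targets plausible. The sketch below counts weight storage only. The GPU capacities (80 GB per H100, 192 GB for a B200) are assumptions based on nominal specs, not figures from Cohere, and it assumes 4-bit weights in both configurations. KV cache, activations, and runtime overhead are ignored.

```python
def weight_memory_gb(total_params: float, bits_per_weight: int) -> float:
    """Memory needed to store the weights alone, in GB (1 GB = 1e9 bytes)."""
    return total_params * bits_per_weight / 8 / 1e9


TOTAL_PARAMS = 218e9   # total parameters, from the announcement
ACTIVE_PARAMS = 25e9   # active parameters, from the announcement

H100_GB = 80           # assumption: 80 GB H100 variant
B200_GB = 192          # assumption: nominal B200 memory capacity

w4_gb = weight_memory_gb(TOTAL_PARAMS, 4)     # 4-bit weights
w16_gb = weight_memory_gb(TOTAL_PARAMS, 16)   # 16-bit weights, for comparison

fits_two_h100 = w4_gb <= 2 * H100_GB
fits_one_b200 = w4_gb <= B200_GB
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"4-bit weights:  {w4_gb:.0f} GB")
print(f"16-bit weights: {w16_gb:.0f} GB")
print(f"fits on 2x H100: {fits_two_h100}, fits on 1x B200: {fits_one_b200}")
print(f"share of parameters active: {active_fraction:.1%}")
```

Note that all 218B parameters must still be resident in memory; the 25B active figure reduces compute per token, not the storage footprint.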
Command A+ unifies, under a single model, the capabilities previously split across Command A Reasoning, Command A Vision, and Command A Translate. It supports 48 languages (up from 23 in previous versions), with an improved tokenizer for non-European languages (+20% for Arabic, +16% for Korean, +18% for Japanese).
| Benchmark | Command A+ | Command A Reasoning |
|---|---|---|
| τ²-Bench Telecom | 85% | 37% |
| Terminal-Bench Hard | 25% | 3% |
| MMMU | 75.1% | N/A |
| MathVista | 80.6% | 73.5% |
| North Agentic QA | +20% improvement | reference |
| North Data Analysis | +32% improvement | reference |
The model is up to 2× faster than Command A Reasoning, with 30% lower latency, and speculative decoding adds a further 1.5–1.6× speedup. It is available on Hugging Face and can be served with vLLM. A score of 37 on the Artificial Analysis Intelligence Index places it first among open-source models on that index.
“Introducing: Cohere Command A+ — We’ve created our most powerful LLM yet, optimized it to run on as little hardware as possible, and released it open-source for all.” — @cohere on X
Gemini for Science — AI as a partner in scientific discovery
May 20 — Announced at Google I/O 2026 and shared on X on May 20, Gemini for Science is a suite of experimental tools for scientific research. With data volumes exploding, the goal is to help researchers connect information that no individual could process alone.
Three experimental tools are unveiled:
| Tool | Base | Function |
|---|---|---|
| Hypothesis Generation | Co-Scientist | Discovery and refinement of new hypotheses |
| Computational Discovery | AlphaEvolve + ERA | Testing thousands of code variations in parallel |
| Science Skills | 30+ bio models | Integrated bundle for agentic platforms (Antigravity) |
Computational Discovery is the most technical tool: it generates and evaluates thousands of code variations in parallel, making it possible to test new modeling approaches in epidemiology, chemistry, or computational biology in a fraction of the usual time.
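The generate-and-evaluate pattern described here can be illustrated with a toy search. This is a minimal sketch of the general idea only, not Google's implementation or API: the "code variations" are reduced to (slope, intercept) pairs for a linear model, the observations are made up, and every name below is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Made-up observations the candidate models are scored against (y = 2x + 1).
OBSERVED = [(0, 1.0), (1, 3.0), (2, 5.0), (3, 7.0)]


def make_variant(slope: float, intercept: float):
    """Return one candidate model; stands in for a generated code variation."""
    return lambda x: slope * x + intercept


def score(variant) -> float:
    """Sum of squared errors against the observations (lower is better)."""
    return sum((variant(x) - y) ** 2 for x, y in OBSERVED)


def search(grid):
    """Score every (slope, intercept) variant concurrently and return the best."""
    params = [(s, i) for s in grid for i in grid]
    variants = [make_variant(s, i) for s, i in params]
    with ThreadPoolExecutor(max_workers=8) as pool:
        errors = list(pool.map(score, variants))  # results keep input order
    best = min(range(len(params)), key=errors.__getitem__)
    return params[best], errors[best]


best_params, best_error = search([0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0])
print(best_params, best_error)
```

Threads keep the sketch short; for CPU-bound scoring in Python, a process pool or a cluster scheduler would be the more realistic choice.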
Science Skills integrates data from more than 30 major life-sciences models and databases, interfacing with agentic platforms to automate complex manual workflows in just minutes.
The project was developed with 100+ partner institutions, with contributors ranging from PhD students to Nobel laureates.
NVIDIA Nemotron-Labs-Diffusion — Token-diffusion architecture
May 20 — NVIDIA announces Nemotron-Labs-Diffusion, a language model that generates tokens in parallel through diffusion, unlike classic autoregressive LLMs that produce one token at a time. This architecture, inspired by diffusion models for image generation, aims to speed up inference while maintaining output quality.
The approach differs fundamentally from standard autoregressive decoding: rather than predicting each token sequentially, conditioned on the previous ones, the model refines an entire noisy token sequence in parallel over several iterations until it converges. The theoretical advantages include lower latency on long outputs and better parallelization on GPUs.
| Aspect | Classic (autoregressive) | Nemotron-Labs-Diffusion |
|---|---|---|
| Generation | Token by token, sequential | Parallel across the whole sequence |
| Long-output latency | Grows linearly with output length | Potentially reduced |
| Paradigm | GPT-style | Diffusion-style |
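The contrast in the table can be made concrete with a toy decoding loop. The sketch below uses a masked-token flavour of text diffusion, which is an assumption on our part and not NVIDIA's published method. The "denoiser" is a stand-in that already knows the target sentence and assigns made-up confidences. Only the control flow is the point: several positions are committed per step instead of one.

```python
import math

MASK = "_"


def toy_denoiser(seq, target):
    """Stand-in for the model: propose a token and a confidence for every
    masked position at once. It cheats by reading the target; the confidence
    is an arbitrary function of position."""
    proposals = {}
    for pos, tok in enumerate(seq):
        if tok == MASK:
            proposals[pos] = (target[pos], 1.0 / (1 + pos))
    return proposals


def parallel_decode(target, tokens_per_step):
    """Start fully masked; each step commits the most confident proposals."""
    seq = [MASK] * len(target)
    steps = 0
    while MASK in seq:
        proposals = toy_denoiser(seq, target)
        ranked = sorted(proposals, key=lambda p: proposals[p][1], reverse=True)
        for pos in ranked[:tokens_per_step]:
            seq[pos] = proposals[pos][0]
        steps += 1
    return seq, steps


target = "tokens are refined in parallel".split()   # 5 tokens
decoded, steps = parallel_decode(target, tokens_per_step=2)
autoregressive_steps = len(target)                  # one step per token

print(decoded, steps, autoregressive_steps)
```

With 5 tokens and 2 commits per step, the loop finishes in ceil(5 / 2) = 3 model calls instead of 5. Whether a real model keeps quality at such step counts is exactly what the technical report would need to show.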
The full technical report accompanies the release. This is a research contribution from NVIDIA Labs, positioned as an alternative to the dominant autoregressive approach, an active research area since the emergence of text diffusion models such as MDLM and Plaid.
Stability AI — Stable Audio 3.0 (open-weight family)
May 20 — Stability AI releases Stable Audio 3.0, a family of 4 audio models under a commercial license, 3 of which ship with open weights. The lineup covers the full deployment spectrum, from embedded devices to an enterprise API.
| Model | Max length | Deployment | Open weights |
|---|---|---|---|
| Small SFX | short | on-device | Yes |
| Small | 2 min | on-device | Yes |
| Medium | 6:20 | cloud/local | Yes |
| Large | 6:20+ | API/enterprise | No |
The Small SFX, Small, and Medium models are available on Hugging Face. All training data is fully licensed, with announced partnerships with Universal Music Group and Warner Music Group. Advanced features include LoRA training support for custom fine-tuning, and an audio inpainting mode (single-segment editing, multi-segment, causal continuation).
“We want to foster the same kind of community-driven innovation in audio that we sparked in image generation with the launch of Stable Diffusion.” — Stability AI
GitHub Copilot evolves on four fronts
Adaptive auto model selection in VS Code
May 20 — Copilot’s “Auto” option in VS Code now selects the optimal model depending on the nature of the task: complex reasoning, simple code generation, debugging, or tool orchestration. The selection is based on real-time availability and reliability metrics. Practical benefit: a 10% reduction in the premium request multiplier when using Auto, with no configuration required.
Semantic natural-language issue search
May 20 — Copilot Chat on the web integrates a semantic issue index: a developer can search for “mobile rendering bugs reported last month” without knowing the exact title, and get results grouped by context. Generally available on all Copilot plans.
Removal of Gemini models from Copilot Chat web
May 20 — All Gemini models are removed from Copilot Chat on github.com, along with GPT-5.2 Codex and GPT-5.4 nano. Only OpenAI and Anthropic (Claude) models remain available on the web. GitHub justifies the choice by citing consistency in answer quality. Gemini remains available in IDEs and through the API.
Fix with Copilot — batch application of code review feedback
May 19 — The “Implement suggestion” button is renamed “Fix with Copilot” with a new dialog (model choice, target branch, custom instructions). A new “Fix batch with Copilot” button makes it possible to group multiple code review comments and send them simultaneously to the Copilot cloud agent, reducing friction on PRs with many comments.
Claude Code v2.1.144 and v2.1.145
May 19 — Claude Code ships two versions in 24 hours with a substantial set of new features and fixes.
Version 2.1.144 improves background session management: the /resume command now shows --bg sessions, and sub-agent completion notifications include the duration (e.g. “Agent completed · 3h 2m 5s”). The /model command now applies only to the current session (press d to set the permanent default). “Extra usage” is renamed “usage credits” to clarify terminology, and a fix removes a startup stall of up to 75 seconds when api.anthropic.com is unreachable (VPN, firewall), which improves the enterprise experience.
Version 2.1.145 stands out for the introduction of claude agents --json, a command designed for integration into shell scripts (tmux-resurrect, status bars, session pickers). OpenTelemetry tracing is enhanced with agent_id and parent_agent_id in spans, so that traces reflect the correct sub-agent hierarchy. The /plugin screen now displays a plugin’s full contents (commands, agents, skills, hooks, MCP/LSP servers) before installation. Stop/SubagentStop hooks get two new fields: background_tasks and session_crons.
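To see why the two new span fields matter, here is a minimal sketch of rebuilding a sub-agent tree from a flat list of spans. Only the field names agent_id and parent_agent_id come from the changelog. Treating spans as plain dictionaries, using None for a root agent, and the sample agent names are all assumptions made for illustration; real OpenTelemetry exports carry these values as span attributes in a richer structure.

```python
from collections import defaultdict


def build_agent_tree(spans):
    """Group agent ids under their parent id; root agents sit under None."""
    children = defaultdict(list)
    for span in spans:
        children[span.get("parent_agent_id")].append(span["agent_id"])
    return children


def render(children, parent=None, depth=0):
    """Depth-first listing of the hierarchy, indented two spaces per level."""
    lines = []
    for agent_id in children.get(parent, []):
        lines.append("  " * depth + agent_id)
        lines.extend(render(children, agent_id, depth + 1))
    return lines


# Hypothetical spans, flattened the way a trace backend might return them.
spans = [
    {"agent_id": "main", "parent_agent_id": None},
    {"agent_id": "research", "parent_agent_id": "main"},
    {"agent_id": "tests", "parent_agent_id": "main"},
    {"agent_id": "lint", "parent_agent_id": "tests"},
]

tree_lines = render(build_agent_tree(spans))
print("\n".join(tree_lines))
```

Without a parent identifier, the same four spans could only be shown as a flat list, which is the gap the release closes.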
Anthropic opens the conversation on shaping AI character
May 19 — Anthropic has published an article detailing a program of regular dialogues with philosophers, clergy, and ethicists drawn from more than 15 religious and cultural traditions. The goal is to enrich reflection on what it means to shape the character of an AI system, drawing on centuries of accumulated thought about virtue and the good life without aligning Claude to any particular tradition.
One experimental result is worth noting: a tool that Claude can invoke during a task to review its own ethical commitments. Used spontaneously before high-impact actions, it showed “a marked reduction in misaligned behaviors” in internal evaluations. Next steps will include discussions with legal experts, psychologists, and civic institutions.
Cohere — MOUs with Indra Group and Multiverse Computing
May 20 — Cohere signs two memoranda of understanding (MOU) during the state visit of King Felipe VI of Spain to Canada. The first brings Cohere together with IndraMind (the AI arm of the Spanish defense and digitization group Indra) to build a sovereign AI ecosystem including language adaptations for Spain’s five official languages. A defense component includes analysis and planning capabilities for multinational exercises. The second involves Multiverse Computing (quantum-inspired AI optimization, Spain/Canada) to explore business opportunities in Europe and Canada.
“Enterprises no longer want to rent AI — they want to own it.” — Aidan Gomez, co-founder and CEO of Cohere
Perplexity — Query-aware context compression in production
May 20 — Perplexity is deploying in production a query-aware context compression system that reduces context tokens by up to 70% while improving answer accuracy. The principle: a lightweight model surgically extracts the passages relevant to the query before passing them to the main LLM, eliminating ads, metadata, and off-topic content.
| Metric | Value |
|---|---|
| Context token reduction | up to 70% |
| Vital content gain per extract | +63% |
| Inference latency reduction | 35–40% |
| Aggregated GPU compute reduction | 40–45% |
| Production latency (p99) | < 20 ms |
The pplx-diffusion backbone (17 layers, distilled from 28 layers) predicts in parallel which segments to keep without generating text — an extractive approach that guarantees citation fidelity. On SimpleQA, the “medium” preset with compression reaches 95% accuracy with only 200 tokens on average per document.
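The extractive principle can be sketched in a few lines. This is not Perplexity's system: simple term overlap with the query stands in for the learned pplx-diffusion scorer, and the stopword list, threshold, and sample segments are all invented for illustration. What the sketch does preserve is the key property named above: kept segments are returned verbatim, so nothing is paraphrased or generated.

```python
import re

# Tiny illustrative stopword list so filler words do not count as matches.
STOPWORDS = {"a", "and", "does", "for", "how", "is", "of", "on", "the", "to", "what", "with"}


def terms(text):
    """Lowercased alphanumeric terms, minus stopwords."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}


def compress(query, segments, min_overlap=1):
    """Keep, verbatim and in order, the segments sharing terms with the query."""
    query_terms = terms(query)
    return [s for s in segments if len(query_terms & terms(s)) >= min_overlap]


def token_reduction(segments, kept):
    """Fraction of whitespace-separated tokens removed by compression."""
    before = sum(len(s.split()) for s in segments)
    after = sum(len(s.split()) for s in kept)
    return 1 - after / before


segments = [
    "Subscribe now for 50% off premium plans.",
    "Speculative decoding drafts tokens with a small model and verifies them with a large one.",
    "Posted by admin on Tuesday. Share this article.",
    "Reported decoding speedups range from 1.5x to 1.6x.",
]
kept = compress("How does speculative decoding work?", segments)
reduction = token_reduction(segments, kept)

print(kept)
print(f"context reduced by {reduction:.0%}")
```

Because the output is a subset of the input segments, any citation pointing at a kept passage still matches the source text exactly.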
ElevenLabs — Speech Engine, a voice agent in one prompt
May 20 — ElevenLabs launches Speech Engine, a unified voice pipeline (speech synthesis + transcription + orchestration) that lets developers turn a text-based conversational agent into a full voice agent with a single prompt. It is available in ElevenAPI at 8 cents per minute, with volume-based discounts. Agents can be migrated to ElevenAgents for additional deployment channels with monitoring and analytics.
Luma Agents integrates Seedance 2.0
May 19 — Luma Agents integrates Seedance 2.0, ByteDance’s video generation model, into its creative agents platform, with the same workflow as the models already integrated. The addition widens the choice of models accessible via Luma Agents and positions the platform as a multi-model orchestration hub for AI video.
Kling AI at Cannes — House of David, the first Hollywood production to use AI at industrial scale
May 20 — At the 2026 Cannes Film Festival, Kling AI confirms the industrial use of its technology in House of David (Prime Video), a series with 44 million global viewers that ranked among the top 10 new series in the United States and reached number 1 on Prime Video US. According to Kling AI, this is the first Hollywood production to publicly acknowledge integrating AI video generation into its production pipeline at scale, with consistent shots that meet strict industry standards.
Briefs
- Running Guide Agent — Google DeepMind — Personal AI agent dedicated to running training, presented as “a step toward limitless running.” 🔗 DeepMind blog
- Midjourney V8.1 — flag --no reintroduced — The anti-prompting flag is back in V8.1 to exclude elements from generated images (e.g. --no people). 🔗 @midjourney announcement
- Anthropic /usage revamped in Claude Code — Responding to a user, Boris Cherny confirms an overhaul of the /usage UI to better visualize token consumption. 🔗 source
- MiniMax Speech 2.8 Turbo — 600+ voices on Together AI — More than 600 new Speech 2.8 Turbo voices are now available on the Together AI platform. 🔗 @MiniMax_AI announcement
What this means
Fundamental research and autonomous AI. The refutation of Erdős’s conjecture by a general-purpose OpenAI model is more than an anecdote. What strikes the mathematicians involved is the nature of the result: an unexpected connection between two branches of mathematics (algebraic number theory and discrete geometry), sustained over 125 pages of coherent reasoning. Coupled with Gemini for Science (developed with 100+ institutions), the trend is clear: AI is beginning to be integrated not just as a scientific data-processing tool, but as a discovery partner capable of generating original hypotheses.
Alternative architectures to the autoregressive paradigm. Two announcements today challenge the dominant GPT-style model. NVIDIA Nemotron-Labs-Diffusion generates tokens in parallel through diffusion rather than sequentially. Stability AI’s Stable Audio 3.0 demonstrates that diffusion produces high-quality musical results, with open-weight models in a lineup spanning 4 deployment tiers. Together, these releases suggest that diffusion is no longer confined to image generation: it is becoming a serious competing architecture for text and audio.
Sovereignty and enterprise AI. Command A+ (218B open-source MoE, Apache 2.0, 2× H100) and Cohere’s MOUs with Indra Group and Multiverse Computing illustrate a broader trend: large organizations — governments, defense, regulated sectors — want to deploy their models within their own infrastructure. The combination of an efficient MoE architecture (25B active out of 218B total) and an Apache 2.0 license makes Command A+ the best-positioned open-source model for sovereign deployments as of late May 2026.
Growing pressure on developer tooling. Claude Code 2.1.144 and 2.1.145, GitHub Copilot’s four simultaneous updates, and Perplexity’s context compression (up to 70% fewer context tokens, 40–45% less GPU compute) are consistent signals: the competition is shifting from raw model quality toward tool ergonomics, scriptability (claude agents --json), inference cost (a 10% lower premium request multiplier with Auto model selection, pplx-diffusion), and production robustness (the fix for Claude Code’s startup stall behind VPNs and firewalls).
Sources
- OpenAI — Model disproves discrete geometry conjecture
- OpenAI on X
- Cohere — Command A+ blog
- Cohere Command A+ on X
- Google AI — Gemini for Science on X
- NVIDIA AI — Nemotron-Labs-Diffusion on X
- Stability AI — Stable Audio 3.0
- GitHub Changelog — Auto model selection VS Code
- GitHub Changelog — Semantic issue search
- GitHub Changelog — Models available on the web
- GitHub Changelog — Fix with Copilot
- Claude Code CHANGELOG
- Anthropic — Widening the conversation on frontier AI
- Cohere — MOUs Indra and Multiverse Computing
- Perplexity — Query-aware context compression on X
- Perplexity — Research article
- ElevenLabs — Speech Engine on X
- Luma Labs — Seedance 2.0 on X
- Kling AI — Cannes on X
- Google DeepMind — Running Guide Agent
- Midjourney — Flag --no on X
- Boris Cherny — /usage revamped on X
- MiniMax — Speech 2.8 Turbo on X