OpenAI Refutes an 80-Year-Old Erdős Conjecture, Cohere Command A+ Open Source, NVIDIA Nemotron-Labs-Diffusion

Article translated from fr to en with gpt-5.4-mini.

On May 20, 2026, AI broke into fundamental mathematics: an OpenAI model refuted a conjecture posed by Paul Erdős in 1946, with a 125-page proof validated by mathematicians including Fields Medalist Tim Gowers. On the model side, Cohere releases Command A+ as open source under Apache 2.0 (MoE architecture, 218B total parameters, 25B active), NVIDIA launches Nemotron-Labs-Diffusion with parallel token generation, and Stability AI unveils Stable Audio 3.0 (4 models, 3 of them with open weights). On the tooling side, GitHub Copilot evolves on four fronts simultaneously, and Claude Code ships two versions in 24 hours.


OpenAI refutes an 80-year-old Erdős conjecture

May 20 — OpenAI has published a first-of-its-kind result: an internal general reasoning model solved the planar unit distance problem, an open question since Paul Erdős posed it in 1946. The problem asks for the maximum number of pairs of points at exactly distance 1 among n points in the plane. Since the 1940s, the mathematical community had believed that Erdős’s square-grid constructions were essentially optimal.
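To make the quantity being counted concrete, here is a minimal Python sketch that counts unit-distance pairs by brute force. The function names are illustrative, and the plain integer grid used here is only the simplest possible configuration; Erdős's actual constructions rescale the grid so that many more pairs land at distance exactly 1.

```python
from itertools import combinations


def count_unit_distances(points, tol=1e-9):
    """Count unordered pairs of points whose Euclidean distance is 1."""
    count = 0
    for (x1, y1), (x2, y2) in combinations(points, 2):
        squared = (x1 - x2) ** 2 + (y1 - y2) ** 2
        # Compare squared distances to avoid an unnecessary square root.
        if abs(squared - 1.0) <= tol:
            count += 1
    return count


def square_grid(side):
    """Return the side x side integer grid as a list of (x, y) points."""
    return [(x, y) for x in range(side) for y in range(side)]


# A 4 x 4 grid has 16 points; each row and each column contributes 3
# adjacent pairs, so there are 2 * 4 * 3 = 24 unit distances.
print(count_unit_distances(square_grid(4)))
```

The brute-force loop is quadratic in the number of points, which is fine for illustration but says nothing about how the extremal configurations in the proof are built.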

The model produced a proof showing the existence of an infinite family of configurations that exceeds the conjectured bound, with an exponent δ = 0.014 established by Will Sawin (Princeton). The breakthrough relies on an unexpected mathematical tool: infinite class field towers and Golod-Shafarevich theory, from algebraic number theory, applied to a basic Euclidean geometry problem. This connection between two a priori distant fields is, according to the mathematicians involved, the heart of the originality of the result.

| Aspect | Detail |
| --- | --- |
| Problem | Planar unit distances (Erdős, 1946) |
| Previous bound | Growth in n^(1+C/loglog(n)) (Spencer-Szemerédi-Trotter, 1984) |
| New result | n^(1+δ), δ = 0.014 |
| Mathematical tool | Algebraic number theory (Golod-Shafarevich) |
| Model | Internal general reasoning model (unnamed) |
| Chain-of-thought length | 125 pages |
| Validation | External mathematician group + companion paper |

What makes the result especially notable: it was not produced by a system trained specifically for mathematics or targeted at this problem. It is a general-purpose model, evaluated on a collection of Erdős problems as part of a broader exploration of autonomous research capabilities.

Tim Gowers (Fields Medal) calls the result a “milestone in AI mathematics”. Arul Shankar (Princeton) goes further:

“In my opinion this paper demonstrates that current AI models go beyond just helpers to human mathematicians – they are capable of having original ingenious ideas, and then carrying them out to fruition.” — Arul Shankar, number theorist, Princeton

OpenAI sees this result as a signal for fundamental research: if a model can sustain complex reasoning across 125 pages and connect distant mathematical domains, those capabilities are transferable to biology, physics, materials, and medicine.

🔗 OpenAI article


Cohere Command A+ — open-source MoE flagship

May 20 — Cohere launches Command A+, its most powerful model to date, open source under the Apache 2.0 license. The mixture-of-experts architecture (sparse MoE) uses 218B total parameters but only 25B active at each inference, allowing it to run on two NVIDIA H100 GPUs or a single Blackwell (B200) GPU with W4A4 quantization.
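The hardware claim can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is an estimate under stated assumptions, not Cohere's sizing method: it assumes 4-bit weights in both configurations, counts weights only (no KV cache or activations), and uses 80 GB per H100 and 192 GB per B200 as assumed capacities.

```python
def weight_memory_gb(total_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    return total_params * bits_per_weight / 8 / 1e9


TOTAL_PARAMS = 218e9   # total parameters, from the article
ACTIVE_PARAMS = 25e9   # parameters active per token, from the article

# Assumed GPU memory capacities (not stated in the article).
H100_GB = 80
B200_GB = 192

w4_gb = weight_memory_gb(TOTAL_PARAMS, 4)
bf16_gb = weight_memory_gb(TOTAL_PARAMS, 16)
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"4-bit weights:  {w4_gb:.0f} GB")
print(f"16-bit weights: {bf16_gb:.0f} GB")
print(f"Active share per token: {active_fraction:.1%}")
print(f"Fits on 2x H100 (4-bit): {w4_gb <= 2 * H100_GB}")
print(f"Fits on 1x B200 (4-bit): {w4_gb <= B200_GB}")
```

In a sparse MoE all experts typically have to stay resident in memory, so the footprint is driven by the 218B total, while the 25B active parameters mainly determine compute per token.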

Command A+ unifies, under a single model, the capabilities previously split across Command A Reasoning, Command A Vision, and Command A Translate. It supports 48 languages (up from 23 in previous versions), with an improved tokenizer for non-European languages (+20% for Arabic, +16% for Korean, +18% for Japanese).

| Benchmark | Command A+ | Command A Reasoning |
| --- | --- | --- |
| τ²-Bench Telecom | 85% | 37% |
| Terminal-Bench Hard | 25% | 3% |
| MMMU | 75.1% | N/A |
| MathVista | 80.6% | 73.5% |
| North Agentic QA | +20% improvement | reference |
| North Data Analysis | +32% improvement | reference |

The model is up to 2× faster than Command A Reasoning, with 30% lower latency; speculative decoding adds a further 1.5–1.6× speedup. It is available on Hugging Face and can be served with vLLM. A score of 37 on the Artificial Analysis Intelligence Index places it first among open-source models.

“Introducing: Cohere Command A+ — We’ve created our most powerful LLM yet, optimized it to run on as little hardware as possible, and released it open-source for all.” — @cohere on X

🔗 Cohere blog


Gemini for Science — AI as a partner in scientific discovery

May 20 — Announced at Google I/O 2026 and tweeted on May 20, Gemini for Science is a suite of experimental tools for scientific research. In the face of exploding data volumes, the goal is to enable researchers to connect information that no individual can process alone.

Three experimental tools are unveiled:

| Tool | Base | Function |
| --- | --- | --- |
| Hypothesis Generation | Co-Scientist | Discovery and refinement of new hypotheses |
| Computational Discovery | AlphaEvolve + ERA | Testing thousands of code variations in parallel |
| Science Skills | 30+ bio models | Integrated bundle for agentic platforms (Antigravity) |

Computational Discovery is the most technical tool: it generates and evaluates thousands of code variations in parallel, making it possible to test new modeling approaches in epidemiology, chemistry, or computational biology in a fraction of the usual time.

Science Skills integrates data from more than 30 major life-sciences models and databases, interfacing with agentic platforms to automate complex manual workflows in just minutes.

The project was developed with 100+ partner institutions, from PhD students to Nobel laureates.

🔗 @GoogleAI announcement


NVIDIA Nemotron-Labs-Diffusion — Token-diffusion architecture

May 20 — NVIDIA announces Nemotron-Labs-Diffusion, a language model that generates tokens in parallel through diffusion, unlike classic autoregressive LLMs that produce one token at a time. This architecture — inspired by diffusion models for image generation — aims to speed up inference while maintaining output quality.

The approach is fundamentally different from the standard transformer paradigm: rather than sequentially predicting each token conditioned on the previous ones, the model iterates in parallel over an entire noisy token sequence until convergence. The theoretical advantages include lower latency on long outputs and better parallelization on GPUs.
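The contrast with token-by-token decoding can be illustrated with a toy loop. This is a generic masked-denoising sketch, not NVIDIA's algorithm: the article does not describe the noise process or schedule, so masking stands in for noise, and `toy_denoiser` is a hypothetical stand-in for a trained model that already "knows" the target.

```python
import random

MASK = "_"


def toy_denoiser(sequence, target, rng):
    """Stand-in for a trained model: propose a (token, confidence)
    pair for every position that is still masked."""
    return {
        i: (target[i], rng.random())
        for i, tok in enumerate(sequence)
        if tok == MASK
    }


def diffusion_decode(target, tokens_per_step=4, seed=0):
    """Start fully masked and refine all positions in parallel passes."""
    rng = random.Random(seed)
    sequence = [MASK] * len(target)
    steps = 0
    while MASK in sequence:
        proposals = toy_denoiser(sequence, target, rng)
        # Commit the most confident proposals; the rest stay masked
        # and are revisited on the next pass.
        ranked = sorted(proposals, key=lambda i: proposals[i][1], reverse=True)
        for i in ranked[:tokens_per_step]:
            sequence[i] = proposals[i][0]
        steps += 1
    return sequence, steps


target = "the model refines every position at once".split()  # 7 tokens
decoded, steps = diffusion_decode(target, tokens_per_step=4)
print(decoded == target, f"{steps} passes vs {len(target)} autoregressive steps")
```

The number of passes depends on how many positions are committed per pass rather than on sequence length alone, which is where the potential latency gain on long outputs comes from.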

| Aspect | Classic (autoregressive) | Nemotron-Labs-Diffusion |
| --- | --- | --- |
| Generation | Token by token, sequential | Parallel across the whole sequence |
| Long-output latency | Grows linearly | Potentially reduced |
| Paradigm | GPT-style | Diffusion-style |

A full technical report accompanies the release. This is a research contribution from NVIDIA Labs, positioned as an architectural alternative to the dominant autoregressive transformer, an area of active research since the emergence of text diffusion models such as MDLM and Plaid.

🔗 @NVIDIAAI announcement


Stability AI — Stable Audio 3.0 (open-weight family)

May 20 — Stability AI releases Stable Audio 3.0, a family of 4 audio models, 3 of them with open weights under a commercial license. The lineup covers the full deployment spectrum, from embedded devices to an enterprise API.

| Model | Max length | Deployment | Open weights |
| --- | --- | --- | --- |
| Small SFX | short | on-device | Yes |
| Small | 2 min | on-device | Yes |
| Medium | 6:20 | cloud/local | Yes |
| Large | 6:20+ | API/enterprise | No |

The Small SFX, Small, and Medium models are available on Hugging Face. All training data is fully licensed, with announced partnerships with Universal Music Group and Warner Music Group. Advanced features include LoRA training support for custom fine-tuning, and an audio inpainting mode (single-segment editing, multi-segment, causal continuation).

“We want to foster the same kind of community-driven innovation in audio that we sparked in image generation with the launch of Stable Diffusion.” — Stability AI


GitHub Copilot evolves on four fronts

Adaptive auto model selection in VS Code

May 20 — Copilot’s “Auto” option in VS Code now selects the optimal model depending on the nature of the task: complex reasoning, simple code generation, debugging, or tool orchestration. The selection is based on real-time availability and reliability metrics. Practical benefit: a 10% reduction in the premium request multiplier when using Auto, with no configuration required.

🔗 GitHub changelog

May 20 — Copilot Chat on the web integrates a semantic issue index: a developer can search for “mobile rendering bugs reported last month” without knowing the exact title, and get results grouped by context. The feature is generally available on all Copilot plans.

🔗 GitHub changelog

Removal of Gemini models from Copilot Chat web

May 20 — All Gemini models are removed from Copilot Chat on github.com, along with GPT-5.2 Codex and GPT-5.4 nano. Only OpenAI and Claude models remain available on the web. GitHub cites consistency of answer quality as the reason. Gemini remains available in IDEs and through the API.

🔗 GitHub changelog

Fix with Copilot — batch application of code review feedback

May 19 — The “Implement suggestion” button is renamed “Fix with Copilot” with a new dialog (model choice, target branch, custom instructions). A new “Fix batch with Copilot” button makes it possible to group multiple code review comments and send them simultaneously to the Copilot cloud agent, reducing friction on PRs with many comments.

🔗 GitHub changelog


Claude Code v2.1.144 and v2.1.145

May 19 — Claude Code ships two versions in 24 hours with a substantial set of new features and fixes.

Version 2.1.144 improves background session management: the /resume command now shows --bg sessions, and sub-agent completion notifications include the duration (e.g. “Agent completed · 3h 2m 5s”). The /model command now applies only to the current session (press d to set the permanent default). “Extra usage” is renamed “usage credits” to clarify the terminology, and a fix for a startup stall of up to 75 seconds when api.anthropic.com is unreachable (VPN, firewall) improves the enterprise experience.

Version 2.1.145 stands out for the introduction of claude agents --json, a command designed for integration into shell scripts (tmux-resurrect, status bars, session pickers). OpenTelemetry tracing is enhanced with agent_id and parent_agent_id in spans, enabling a correct sub-agent hierarchy. The /plugin screen now displays the full content (commands, agents, skills, hooks, MCP/LSP servers) before installation. Stop/SubagentStop hooks get two new fields: background_tasks and session_crons.
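To show why the two new span attributes matter, here is a minimal sketch that rebuilds a sub-agent hierarchy from them. Only the names agent_id and parent_agent_id come from the changelog; the flat dictionary shape, the sample agent names, and the convention that a root agent has no parent are assumptions made for illustration.

```python
from collections import defaultdict


def build_agent_tree(spans):
    """Map each parent agent id to its child agent ids.
    Root agents are stored under the key None (an assumption)."""
    children = defaultdict(list)
    for span in spans:
        parent = span.get("parent_agent_id")
        agent = span["agent_id"]
        # An agent may emit many spans; record it once per parent.
        if agent not in children[parent]:
            children[parent].append(agent)
    return children


def render(children, parent=None, depth=0):
    """Return the hierarchy as indented lines, depth-first."""
    lines = []
    for agent_id in children.get(parent, []):
        lines.append("  " * depth + agent_id)
        lines.extend(render(children, agent_id, depth + 1))
    return lines


# Hypothetical span attributes, reduced to the two fields of interest.
spans = [
    {"agent_id": "main", "parent_agent_id": None},
    {"agent_id": "reviewer", "parent_agent_id": "main"},
    {"agent_id": "test-runner", "parent_agent_id": "main"},
    {"agent_id": "linter", "parent_agent_id": "reviewer"},
    {"agent_id": "reviewer", "parent_agent_id": "main"},  # second span, same agent
]
print("\n".join(render(build_agent_tree(spans))))
```

Without a parent reference, spans from nested sub-agents can only be listed flat; with it, a trace viewer or a script like this one can attribute work to the right branch.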

🔗 Claude Code CHANGELOG


Anthropic opens the conversation on shaping AI character

May 19 — Anthropic has published an article detailing an initiative of regular dialogues with philosophers, clergy, and ethicists drawn from more than 15 religious and cultural traditions. The goal is to enrich reflection on what it means to shape the character of an AI system — drawing on centuries of accumulated thought about virtue and the good life, without aligning Claude to any particular tradition.

One experimental result is worth noting: a tool that Claude can invoke during a task to review its own ethical commitments. Used spontaneously before high-impact actions, it showed “a marked reduction in misaligned behaviors” in internal evaluations. Next steps will include discussions with legal experts, psychologists, and civic institutions.

🔗 Anthropic article


Cohere — MOUs with Indra Group and Multiverse Computing

May 20 — Cohere signs two memoranda of understanding (MOU) during the state visit of King Felipe VI of Spain to Canada. The first brings Cohere together with IndraMind (the AI arm of the Spanish defense and digitization group Indra) to build a sovereign AI ecosystem including language adaptations for Spain’s five official languages. A defense component includes analysis and planning capabilities for multinational exercises. The second involves Multiverse Computing (quantum-inspired AI optimization, Spain/Canada) to explore business opportunities in Europe and Canada.

“Enterprises no longer want to rent AI — they want to own it.” — Aidan Gomez, co-founder and CEO of Cohere


Perplexity — Query-aware context compression in production

May 20 — Perplexity is deploying in production a query-aware context compression system that reduces context tokens by up to 70% while improving answer accuracy. The principle: a lightweight model surgically extracts the passages relevant to the query before passing them to the main LLM, eliminating ads, metadata, and off-topic content.

| Metric | Value |
| --- | --- |
| Context token reduction | up to 70% |
| Vital content gain per extract | +63% |
| Inference latency reduction | 35–40% |
| Aggregated GPU compute reduction | 40–45% |
| Production latency (p99) | < 20 ms |

The pplx-diffusion backbone (17 layers, distilled from 28 layers) predicts in parallel which segments to keep without generating text — an extractive approach that guarantees citation fidelity. On SimpleQA, the “medium” preset with compression reaches 95% accuracy with only 200 tokens on average per document.
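The extractive principle (select segments, never rewrite them) can be sketched in a few lines. This is not Perplexity's method: plain word overlap stands in for the learned pplx-diffusion scorer, and the function names, sample segments, and token budget are all illustrative assumptions.

```python
import re


def tokenize(text):
    """Lowercase word tokens; a crude proxy for model tokens."""
    return re.findall(r"\w+", text.lower())


def compress_context(query, segments, token_budget):
    """Keep the segments most relevant to the query, verbatim,
    without exceeding the token budget."""
    query_terms = set(tokenize(query))
    scored = []
    for index, segment in enumerate(segments):
        tokens = tokenize(segment)
        overlap = sum(1 for t in tokens if t in query_terms)
        score = overlap / len(tokens) if tokens else 0.0
        scored.append((score, index, len(tokens)))

    kept, used = [], 0
    # Highest score first; ties broken by document position.
    for score, index, length in sorted(scored, key=lambda s: (-s[0], s[1])):
        if score > 0 and used + length <= token_budget:
            kept.append(index)
            used += length
    # Restore document order so the extract still reads naturally.
    return [segments[i] for i in sorted(kept)]


segments = [
    "Subscribe to our newsletter for weekly deals.",
    "The unit distance problem was posed by Erdős in 1946.",
    "Cookie settings and privacy policy.",
    "The problem counts pairs of points at distance exactly 1.",
]
query = "Who posed the unit distance problem?"
print(compress_context(query, segments, token_budget=12))
```

Because every kept segment is copied unchanged from the source, any citation into the compressed context still points at text that exists in the original document, which is the property the extractive design is meant to preserve.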

🔗 Perplexity announcement


ElevenLabs — Speech Engine, a voice agent in one prompt

May 20 — ElevenLabs launches Speech Engine, a unified voice pipeline (speech synthesis + transcription + orchestration) that lets developers turn a text-based conversational agent into a full voice agent with a single prompt. It is available in ElevenAPI at 8 cents per minute, with volume-based discounts. Agents can be migrated to ElevenAgents for additional deployment channels with monitoring and analytics.

🔗 ElevenLabs announcement


Luma Agents integrates Seedance 2.0

May 19 — Luma Agents integrates Seedance 2.0, ByteDance’s video generation model, into its creative agents platform, using the same workflow as the models already integrated. The addition widens the choice of models accessible via Luma Agents and positions the platform as a multi-model orchestration hub for AI video.

🔗 Luma announcement


Kling AI at Cannes — House of David, the first Hollywood film with AI at industrial scale

May 20 — At the 2026 Cannes Film Festival, Kling AI confirms the industrial-scale use of its technology in House of David (Prime Video): 44 million viewers worldwide, a top-10 new series in the United States, and number 1 on Prime Video US. This is the first Hollywood production to publicly acknowledge integrating AI video generation into a large-scale production pipeline, with consistent shots that meet strict industry standards.

🔗 Kling AI announcement


Briefs

  • Running Guide Agent — Google DeepMind — Personal AI agent dedicated to running training, presented as “a step toward limitless running.” 🔗 DeepMind blog

  • Midjourney V8.1 — --no flag reintroduced — The negative-prompt flag is back in V8.1 to exclude elements from generated images (e.g. --no people). 🔗 @midjourney announcement

  • Anthropic /usage revamped in Claude Code — Replying to a user, Boris Cherny confirms an overhaul of the /usage UI to better visualize token consumption. 🔗 source

  • MiniMax Speech 2.8 Turbo — 600+ voices on Together AI — More than 600 new Speech 2.8 Turbo voices are now available on the Together AI platform. 🔗 @MiniMax_AI announcement


What this means

Fundamental research and autonomous AI. The refutation of Erdős’s conjecture by a general-purpose OpenAI model is more than an anecdote. What strikes the mathematicians involved is the nature of the result: an unexpected connection between two branches of mathematics (algebraic number theory and discrete geometry), sustained over 125 pages of coherent reasoning. Coupled with Gemini for Science (developed with 100+ institutions), the trend is clear: AI is beginning to be integrated not just as a scientific data-processing tool, but as a discovery partner capable of generating original hypotheses.

Alternative architectures to the autoregressive paradigm. Two announcements today challenge the dominant GPT-style model. NVIDIA Nemotron-Labs-Diffusion generates tokens in parallel through diffusion rather than sequentially. Stability AI’s Stable Audio 3.0 demonstrates that diffusion produces high-quality musical results, with open-weight models across its deployment tiers. The convergence of these approaches suggests that diffusion is no longer confined to image generation: it is becoming a serious competing architecture for text and audio.

Sovereignty and enterprise AI. Command A+ (218B open-source MoE, Apache 2.0, 2× H100) and Cohere’s MOUs with Indra Group and Multiverse Computing illustrate a broader trend: large organizations — governments, defense, regulated sectors — want to deploy their models within their own infrastructure. The combination of an efficient MoE architecture (25B active out of 218B total) and an Apache 2.0 license makes Command A+ the best-positioned open-source model for sovereign deployments as of late May 2026.

Growing pressure on developer tooling. Claude Code 2.1.144 and 2.1.145, GitHub Copilot’s four simultaneous updates, and Perplexity’s context compression (-70% tokens, -40% GPU) are consistent signals: the competition is shifting from raw model quality toward tool ergonomics, scriptability (claude agents --json), inference cost (Auto model selection -10%, pplx-diffusion), and production robustness (fixing the VPN blockage in Claude Code).

