OpenAI Refutes an 80-Year-Old Erdős Conjecture, Cohere Command A+ Open Source, NVIDIA Nemotron-Labs-Diffusion

On May 20, 2026, AI bursts into fundamental mathematics: an OpenAI model refutes a conjecture posed by Paul Erdős in 1946, backed by a 125-page proof validated by mathematicians including Fields Medalist Tim Gowers. On the model side, Cohere releases Command A+ as open source under Apache 2.0 (MoE architecture, 218B total parameters, 25B active), NVIDIA launches Nemotron-Labs-Diffusion with parallel token generation, and Stability AI unveils Stable Audio 3.0 (4 models, 3 of them open-weight). On the tooling side, GitHub Copilot evolves on four fronts simultaneously, and Claude Code ships two versions in 24 hours.

OpenAI refutes an 80-year-old Erdős conjecture

May 20 — OpenAI has published a first-of-its-kind result: an internal general reasoning model solved the planar unit distance problem, an open question since Paul Erdős posed it in 1946. The problem asks for the maximum number of pairs of points at exactly distance 1 among n points in the plane. Since the 1940s, the mathematical community had believed that Erdős’s square-grid constructions were essentially optimal.
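To make the problem statement concrete, here is a minimal sketch that counts pairs of points at a given distance on a small integer grid. The function names are illustrative and not taken from the OpenAI paper. Integer coordinates and squared distances keep the arithmetic exact. Counting pairs at distance 5 is equivalent to counting unit distances after shrinking the grid by a factor of 5, which is the idea behind grid-type constructions: pick a distance that the lattice realizes in many directions.

```python
from itertools import combinations

def square_grid(k):
    """Return the k x k integer grid as a list of (x, y) points."""
    return [(x, y) for x in range(k) for y in range(k)]

def count_pairs_at_squared_distance(points, d2):
    """Count unordered pairs of points whose squared distance equals d2."""
    count = 0
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 == d2:
            count += 1
    return count

grid = square_grid(10)  # n = 100 points

# Distance 1: only horizontal and vertical neighbours qualify.
print(count_pairs_at_squared_distance(grid, 1))   # 180

# Distance 5: 25 = 5^2 + 0^2 = 3^2 + 4^2, so more directions qualify.
print(count_pairs_at_squared_distance(grid, 25))  # 268
```

On the same 100 points, the better-chosen distance already yields more pairs (268 versus 180), and the gap widens as the grid grows.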

The model produced a proof showing the existence of an infinite family of configurations that exceeds the conjectured bound, with an exponent δ = 0.014 established by Will Sawin (Princeton). The breakthrough relies on an unexpected mathematical tool: infinite class field towers and Golod-Shafarevich theory, from algebraic number theory, applied to a basic Euclidean geometry problem. This connection between two a priori distant fields is, according to the mathematicians involved, the heart of the originality of the result.

Aspect	Detail
Problem	Planar unit distances (Erdős, 1946)
Previous bounds	Lower: n^(1+c/log log n) (Erdős, 1946); upper: O(n^(4/3)) (Spencer-Szemerédi-Trotter, 1984)
New result	n^(1+δ), δ = 0.014
Mathematical tool	Algebraic number theory (Golod-Shafarevich)
Model	Internal general reasoning model (unnamed)
Chain-of-thought length	125 pages
Validation	External mathematician group + companion paper
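A short comparison shows why a fixed exponent δ > 0 beats the grid-type growth rate, however small δ is. This is a sketch that ignores constant factors, assumes natural logarithms, and treats C as an unspecified positive constant (its actual value is not given here).

```latex
\[
\frac{n^{1+\delta}}{n^{1+C/\log\log n}} = n^{\,\delta - C/\log\log n},
\qquad
\delta - \frac{C}{\log\log n} > 0
\iff \log\log n > \frac{C}{\delta}
\iff n > \exp\!\bigl(\exp(C/\delta)\bigr).
\]
\[
\delta = 0.014 \;\Longrightarrow\; \frac{C}{\delta} \approx 71.4\,C .
\]
```

Because C / log log n tends to 0 while δ stays fixed, the ratio eventually grows without bound. With δ = 0.014, though, the crossover sits at doubly exponential values of n, so the gap is an asymptotic statement, not something visible on small point sets.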

What makes the result especially notable: it was not produced by a system trained specifically for mathematics or targeted at this problem. It is a general-purpose model, evaluated on a collection of Erdős problems as part of a broader exploration of autonomous research capabilities.

Tim Gowers (Fields Medal) calls the result a “milestone in AI mathematics”. Arul Shankar (Princeton) goes further:

“In my opinion this paper demonstrates that current AI models go beyond just helpers to human mathematicians – they are capable of having original ingenious ideas, and then carrying them out to fruition.” — Arul Shankar, number theorist, Princeton

OpenAI sees this result as a signal for fundamental research: if a model can sustain complex reasoning across 125 pages and connect distant mathematical domains, those capabilities are transferable to biology, physics, materials, and medicine.

🔗 OpenAI article

Cohere Command A+ — open-source MoE flagship

May 20 — Cohere launches Command A+, its most powerful model to date, as open source under the Apache 2.0 license. The sparse mixture-of-experts (MoE) architecture has 218B total parameters, of which only 25B are active per token, allowing it to run on two NVIDIA H100 GPUs or a single Blackwell (B200) GPU with W4A4 quantization.
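A back-of-the-envelope check of the hardware claim, as a sketch only: the per-GPU memory figures below (80 GB for H100, 192 GB for B200) are commonly cited specifications assumed here, not numbers from Cohere's announcement, and the estimate covers weights alone, ignoring KV cache, activations, and runtime overhead.

```python
# Assumed per-GPU memory in GB; commonly cited figures, not taken from the article.
H100_GB = 80
B200_GB = 192

def weight_memory_gb(params_billions, bits_per_weight):
    """Weight storage in GB (1 GB = 10^9 bytes); ignores KV cache and overhead."""
    return params_billions * bits_per_weight / 8

total_b, active_b = 218, 25

w4 = weight_memory_gb(total_b, 4)    # 4-bit weights, as in W4A4
w16 = weight_memory_gb(total_b, 16)  # 16-bit weights, for comparison

print(f"4-bit weights:  {w4:.0f} GB")                       # 109 GB
print(f"16-bit weights: {w16:.0f} GB")                      # 436 GB
print(f"fits on 2x H100: {w4 <= 2 * H100_GB}")              # True
print(f"fits on 1x B200: {w4 <= B200_GB}")                  # True
print(f"active share per token: {active_b / total_b:.1%}")  # 11.5%
```

Note that all 218B parameters must stay resident in memory even though only 25B are used per token: the MoE design reduces compute per token, while quantization is what reduces the memory footprint.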

Command A+ unifies, under a single model, the capabilities previously split across Command A Reasoning, Command A Vision, and Command A Translate. It supports 48 languages (up from 23 in previous versions), with an improved tokenizer for non-European languages (+20% for Arabic, +16% for Korean, +18% for Japanese).

Benchmark	Command A+	Command A Reasoning
τ²-Bench Telecom	85%	37%
Terminal-Bench Hard	25%	3%
MMMU	75.1%	N/A
MathVista	80.6%	73.5%
North Agentic QA	+20% improvement	reference
North Data Analysis	+32% improvement	reference

The model is up to 2× faster than Command A Reasoning, with 30% lower latency, and speculative decoding offers an additional 1.5–1.6× gain. It is available on Hugging Face and via vLLM. Its score of 37 on the Artificial Analysis Intelligence Index makes it the best among open-source models.

“Introducing: Cohere Command A+ — We’ve created our most powerful LLM yet, optimized it to run on as little hardware as possible, and released it open-source for all.” — @cohere on X

🔗 Cohere blog

Gemini for Science — AI as a partner in scientific discovery

May 20 — Announced at Google I/O 2026 and tweeted on May 20, Gemini for Science is a suite of experimental tools for scientific research. In the face of exploding data volumes, the goal is to enable researchers to connect information that no individual can process alone.

Three experimental tools are unveiled:

Tool	Base	Function
Hypothesis Generation	Co-Scientist	Discovery and refinement of new hypotheses
Computational Discovery	AlphaEvolve + ERA	Testing thousands of code variations in parallel
Science Skills	30+ bio models	Integrated bundle for agentic platforms (Antigravity)

Computational Discovery is the most technical tool: it generates and evaluates thousands of code variations in parallel, making it possible to test new modeling approaches in epidemiology, chemistry, or computational biology in a fraction of the usual time.

Science Skills integrates data from more than 30 major life-sciences models and databases, interfacing with agentic platforms to automate complex manual workflows in just minutes.

The project was developed with 100+ partner institutions, from PhD students to Nobel laureates.

🔗 @GoogleAI announcement

NVIDIA Nemotron-Labs-Diffusion — Token-diffusion architecture

May 20 — NVIDIA announces Nemotron-Labs-Diffusion, a language model that generates tokens in parallel through diffusion, unlike classic autoregressive LLMs that produce one token at a time. This architecture — inspired by diffusion models for image generation — aims to speed up inference while maintaining output quality.

The approach is fundamentally different from the standard transformer paradigm: rather than sequentially predicting each token conditioned on the previous ones, the model iterates in parallel over an entire noisy token sequence until convergence. The theoretical advantages include lower latency on long outputs and better parallelization on GPUs.
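The contrast between the two decoding styles can be sketched with a toy example. This is not NVIDIA's algorithm: the "denoiser" below is a hypothetical stand-in that already knows the target sentence and returns made-up confidence scores. The only point is to show the control flow, where each model call predicts every position and several masked positions are committed per step.

```python
MASK = None  # placeholder for a position that has not been decided yet

def toy_denoiser(seq, target):
    """Stand-in for a model: one call returns (token, confidence) for every position."""
    preds = []
    for i, tok in enumerate(target):
        # Illustrative confidence: higher next to already revealed tokens.
        neighbors = sum(
            1 for j in (i - 1, i + 1) if 0 <= j < len(seq) and seq[j] is not MASK
        )
        preds.append((tok, 0.5 + 0.2 * neighbors - 0.001 * i))
    return preds

def diffusion_style_decode(target, tokens_per_step=4):
    """Start fully masked; each step commits the most confident masked positions."""
    seq = [MASK] * len(target)
    calls = 0
    while any(t is MASK for t in seq):
        preds = toy_denoiser(seq, target)  # one call covers the whole sequence
        calls += 1
        masked = [i for i, t in enumerate(seq) if t is MASK]
        masked.sort(key=lambda i: preds[i][1], reverse=True)
        for i in masked[:tokens_per_step]:
            seq[i] = preds[i][0]
    return seq, calls

def autoregressive_decode(target):
    """Baseline: one model call per generated token, left to right."""
    seq, calls = [], 0
    for tok in target:
        seq.append(tok)
        calls += 1
    return seq, calls

sentence = "the quick brown fox jumps over the lazy dog again and again".split()
print(diffusion_style_decode(sentence)[1])  # 3 model calls
print(autoregressive_decode(sentence)[1])   # 12 model calls
```

In a real system each parallel call is more expensive than a single autoregressive step, and the number of steps needed for good quality is an empirical question, which is why the latency gain is described as potential.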

Aspect	Classic (autoregressive)	Nemotron-Labs-Diffusion
Generation	Token by token, sequential	Parallel across the whole sequence
Long-output latency	Grows linearly	Potentially reduced
Paradigm	GPT-style	Diffusion-style

The full technical report accompanies the release. This is a research contribution from NVIDIA Labs, positioned as an architectural alternative to the dominant autoregressive transformer model — an active research area since the emergence of text diffusion models such as MDLM and Plaid.

🔗 @NVIDIAAI announcement

Stability AI — Stable Audio 3.0 (open-weight family)

May 20 — Stability AI releases Stable Audio 3.0, a family of 4 audio models under a commercial license, 3 of which ship with open weights. The lineup covers the full deployment spectrum, from embedded devices to enterprise API.

Model	Max length	Deployment	Open weights
Small SFX	short	on-device	Yes
Small	2 min	on-device	Yes
Medium	6:20	cloud/local	Yes
Large	6:20+	API/enterprise	No

The Small SFX, Small, and Medium models are available on Hugging Face. All training data is fully licensed, with announced partnerships with Universal Music Group and Warner Music Group. Advanced features include LoRA training support for custom fine-tuning, and an audio inpainting mode (single-segment editing, multi-segment, causal continuation).

“We want to foster the same kind of community-driven innovation in audio that we sparked in image generation with the launch of Stable Diffusion.” — Stability AI

GitHub Copilot evolves on four fronts

Adaptive auto model selection in VS Code

May 20 — Copilot’s “Auto” option in VS Code now selects the optimal model depending on the nature of the task: complex reasoning, simple code generation, debugging, or tool orchestration. The selection is based on real-time availability and reliability metrics. Practical benefit: a 10% reduction in the premium request multiplier when using Auto, with no configuration required.

🔗 GitHub changelog

Semantic natural-language issue search

May 20 — Copilot Chat on the web integrates a semantic issue index: a developer can search for “mobile rendering bugs reported last month” without knowing the exact title, and get results grouped by context. The feature is generally available on all Copilot plans.

🔗 GitHub changelog

Removal of Gemini models from Copilot Chat web

May 20 — All Gemini models are removed from Copilot Chat on github.com, along with GPT-5.2 Codex and GPT-5.4 nano. Only OpenAI and Anthropic Claude models remain available on the web. GitHub justifies the choice by citing consistency in answer quality. Gemini remains available in IDEs and the API.

🔗 GitHub changelog

Fix with Copilot — batch application of code review feedback

May 19 — The “Implement suggestion” button is renamed “Fix with Copilot” with a new dialog (model choice, target branch, custom instructions). A new “Fix batch with Copilot” button makes it possible to group multiple code review comments and send them simultaneously to the Copilot cloud agent, reducing friction on PRs with many comments.

🔗 GitHub changelog

Claude Code v2.1.144 and v2.1.145

May 19 — Claude Code ships two versions in 24 hours with a substantial set of new features and fixes.

Version 2.1.144 improves background session management: the /resume command now shows --bg sessions, and sub-agent completion notifications include duration (e.g. “Agent completed · 3h 2m 5s”). The /model command applies only to the current session (press d to set the permanent default). The renaming from “extra usage” to “usage credits” clarifies terminology, and fixing a startup stall of up to 75 seconds when api.anthropic.com is unreachable (VPN, firewall) improves the enterprise experience.

Version 2.1.145 stands out for the introduction of claude agents --json, a command designed for integration into shell scripts (tmux-resurrect, status bars, session pickers). OpenTelemetry tracing is enhanced with agent_id and parent_agent_id in spans, enabling a correct sub-agent hierarchy. The /plugin screen now displays the full content (commands, agents, skills, hooks, MCP/LSP servers) before installation. Stop/SubagentStop hooks get two new fields: background_tasks and session_crons.
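To illustrate what the two new span fields make possible, here is a minimal sketch that rebuilds a sub-agent hierarchy from span records. Only the field names agent_id and parent_agent_id come from the changelog; the flat dictionary layout, the sample values, and the function names are assumptions made for illustration, and real exported spans will carry these fields inside a richer structure.

```python
from collections import defaultdict

def build_agent_tree(spans):
    """Group agents by parent; returns (root agent ids, parent -> children map)."""
    children = defaultdict(list)
    roots, seen = [], set()
    for span in spans:
        agent = span["agent_id"]
        if agent in seen:  # an agent usually emits many spans; register it once
            continue
        seen.add(agent)
        parent = span.get("parent_agent_id")
        if parent is None:
            roots.append(agent)
        else:
            children[parent].append(agent)
    return roots, dict(children)

def render(agents, children, depth=0):
    """Return indented lines showing the hierarchy, parents before children."""
    lines = []
    for agent in agents:
        lines.append("  " * depth + agent)
        lines.extend(render(children.get(agent, []), children, depth + 1))
    return lines

# Hypothetical span records, reduced to the two fields of interest.
spans = [
    {"agent_id": "main", "parent_agent_id": None},
    {"agent_id": "researcher", "parent_agent_id": "main"},
    {"agent_id": "main", "parent_agent_id": None},
    {"agent_id": "test-runner", "parent_agent_id": "main"},
    {"agent_id": "log-reader", "parent_agent_id": "test-runner"},
]

roots, children = build_agent_tree(spans)
print("\n".join(render(roots, children)))
```

Without a parent identifier, spans from nested sub-agents can only be listed side by side; with it, a trace viewer or a script like this one can attribute each piece of work to the agent that delegated it.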

🔗 Claude Code CHANGELOG

Anthropic opens the conversation on shaping AI character

May 19 — Anthropic has published an article detailing an initiative of regular dialogues with philosophers, clergy, and ethicists drawn from more than 15 religious and cultural traditions. The goal is to enrich reflection on what it means to shape the character of an AI system — drawing on centuries of accumulated thought about virtue and the good life, without aligning Claude to any particular tradition.

One experimental result is worth noting: a tool that Claude can invoke during a task to review its own ethical commitments. Used spontaneously before high-impact actions, it showed “a marked reduction in misaligned behaviors” in internal evaluations. Next steps will include discussions with legal experts, psychologists, and civic institutions.

🔗 Anthropic article

Cohere — MOUs with Indra Group and Multiverse Computing

May 20 — Cohere signs two memoranda of understanding (MOU) during the state visit of King Felipe VI of Spain to Canada. The first brings Cohere together with IndraMind (the AI arm of the Spanish defense and digitization group Indra) to build a sovereign AI ecosystem including language adaptations for Spain’s five official languages. A defense component includes analysis and planning capabilities for multinational exercises. The second involves Multiverse Computing (quantum-inspired AI optimization, Spain/Canada) to explore business opportunities in Europe and Canada.

“Enterprises no longer want to rent AI — they want to own it.” — Aidan Gomez, co-founder and CEO of Cohere

Perplexity — Query-aware context compression in production

May 20 — Perplexity is deploying in production a query-aware context compression system that reduces context tokens by up to 70% while improving answer accuracy. The principle: a lightweight model surgically extracts the passages relevant to the query before passing them to the main LLM, eliminating ads, metadata, and off-topic content.

Metric	Value
Context token reduction	up to 70%
Vital content gain per extract	+63%
Inference latency reduction	35–40%
Aggregated GPU compute reduction	40–45%
Production latency (p99)	< 20 ms

The pplx-diffusion backbone (17 layers, distilled from 28 layers) predicts in parallel which segments to keep without generating text — an extractive approach that guarantees citation fidelity. On SimpleQA, the “medium” preset with compression reaches 95% accuracy with only 200 tokens on average per document.
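The extractive principle can be sketched in a few lines. This is not Perplexity's system: a crude word-overlap score stands in for the lightweight model, and the function names and token budget are hypothetical. What the sketch does preserve is the key property described above: segments are kept or dropped verbatim, never rewritten, so anything passed to the main LLM can still be quoted exactly.

```python
import re

def tokenize(text):
    """Lowercase word tokens; a rough proxy for real tokenization."""
    return re.findall(r"[a-z0-9]+", text.lower())

def compress_context(query, segments, token_budget):
    """Keep the most query-relevant segments verbatim, within a token budget."""
    query_terms = set(tokenize(query))
    scored = []
    for idx, seg in enumerate(segments):
        tokens = tokenize(seg)
        if not tokens or not query_terms:
            continue
        # Stand-in relevance score: share of query terms found in the segment.
        score = len(query_terms & set(tokens)) / len(query_terms)
        scored.append((score, idx, seg, len(tokens)))

    kept, used = [], 0
    for score, idx, seg, n_tokens in sorted(scored, key=lambda s: (-s[0], s[1])):
        if score == 0 or used + n_tokens > token_budget:
            continue  # drop off-topic segments and anything over budget
        kept.append((idx, seg))
        used += n_tokens
    return [seg for _, seg in sorted(kept)]  # restore original document order

segments = [
    "Subscribe to our newsletter for daily deals!",
    "Paris is the capital and largest city of France.",
    "Cookie settings | Privacy policy",
    "France borders Spain, Italy and Germany.",
]
print(compress_context("What is the capital of France?", segments, token_budget=12))
# ['Paris is the capital and largest city of France.']
```

A production scorer would be a trained model, not word overlap, but the selection step would look similar: rank segments against the query, then pass only the survivors downstream.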

🔗 Perplexity announcement

ElevenLabs — Speech Engine, a voice agent in one prompt

May 20 — ElevenLabs launches Speech Engine, a unified voice pipeline (speech synthesis + transcription + orchestration) allowing developers to turn a text conversational agent into a full voice agent with a single prompt. Available in ElevenAPI, pricing is 8 cents per minute with volume-based discounts. Migration is possible to ElevenAgents for additional deployment channels with monitoring and analytics.

🔗 ElevenLabs announcement

Luma Agents integrates Seedance 2.0

May 19 — Luma Agents integrates Seedance 2.0, ByteDance’s video generation model, into its creative agents platform. It uses the same workflow as the models already integrated. The addition expands the choice of models accessible via Luma Agents, positioning the platform as a multi-model orchestration hub for AI video.

🔗 Luma announcement

Kling AI at Cannes — House of David, the first Hollywood production with AI at industrial scale

May 20 — At the 2026 Cannes Film Festival, Kling AI confirms the industrial use of its technology in House of David (Prime Video): 44 million global viewers, top 10 new series in the United States, number 1 on Prime Video US. This is the first Hollywood production to publicly acknowledge the integration of AI video generation into its large-scale production pipeline, with consistent shots that meet strict industry standards.

🔗 Kling AI announcement

Briefs

Running Guide Agent — Google DeepMind — Personal AI agent dedicated to running training, presented as “a step toward limitless running.” 🔗 DeepMind blog
Midjourney V8.1 — --no flag reintroduced — The negative-prompt flag is back in V8.1 to exclude elements from generated images (e.g. --no people). 🔗 @midjourney announcement
Anthropic /usage revamped in Claude Code — Replying to a user, Boris Cherny confirms an overhaul of the /usage UI to better visualize token consumption. 🔗 source
MiniMax Speech 2.8 Turbo — 600+ voices on Together AI — More than 600 new Speech 2.8 Turbo voices are now available on the Together AI platform. 🔗 @MiniMax_AI announcement

What this means

Fundamental research and autonomous AI. The refutation of Erdős’s conjecture by a general-purpose OpenAI model is not anecdotal. What strikes the mathematicians involved is the nature of the result: an unexpected connection between two branches of mathematics (algebraic number theory and discrete geometry), sustained over 125 pages of coherent reasoning. Coupled with Gemini for Science (developed with 100+ institutions), the trend is clear: AI is beginning to be integrated not just as a scientific data-processing tool, but as a discovery partner capable of generating original hypotheses.

Alternative architectures to the autoregressive paradigm. Two announcements today challenge the dominant GPT-style model. NVIDIA Nemotron-Labs-Diffusion generates tokens in parallel through diffusion rather than sequentially. Stability AI’s Stable Audio 3.0 shows that diffusion can deliver high-quality music with open-weight models, in a 4-model lineup spanning on-device to enterprise API deployment. The convergence of these approaches suggests that diffusion is no longer confined to image generation: it is becoming a serious competing architecture for text and audio.

Sovereignty and enterprise AI. Command A+ (218B open-source MoE, Apache 2.0, 2× H100) and Cohere’s MOUs with Indra Group and Multiverse Computing illustrate a broader trend: large organizations — governments, defense, regulated sectors — want to deploy their models within their own infrastructure. The combination of an efficient MoE architecture (25B active out of 218B total) and an Apache 2.0 license makes Command A+ the best-positioned open-source model for sovereign deployments as of late May 2026.

Growing pressure on developer tooling. Claude Code 2.1.144 and 2.1.145, GitHub Copilot’s four simultaneous updates, and Perplexity’s context compression (-70% tokens, -40% GPU) are consistent signals: the competition is shifting from raw model quality toward tool ergonomics, scriptability (claude agents --json), inference cost (Auto model selection -10%, pplx-diffusion), and production robustness (fixing the VPN blockage in Claude Code).
