This week marks an acceleration on three simultaneous fronts: open models (Mistral Medium 3.5, NVIDIA Nemotron 3 Nano Omni), hardware infrastructure (Google 8th-generation TPU), and agent ecosystems (Vibe Remote Agents, Claude for Creative Work, GitHub Copilot). ElevenLabs also reaches a milestone by turning its AI music engine into a consumer platform with monetization.
Mistral Medium 3.5, Vibe Remote Agents, and Le Chat Work Mode
April 29 — Mistral AI is publishing three major announcements simultaneously: the Mistral Medium 3.5 model, remote agents in Vibe, and Work Mode in Le Chat.
Mistral Medium 3.5 in public preview
Medium 3.5 is a dense 128-billion-parameter model that unifies instruction following, reasoning, and code in a single set of weights, with a 256,000-token context window. It can run locally on only four GPUs.
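Whether four GPUs actually suffice depends on weight precision. A back-of-envelope sketch, assuming FP8 weights (1 byte per parameter — an assumption, not a figure from the announcement) and ignoring KV cache and activations:

```python
# Rough VRAM estimate for serving a dense 128B model across 4 GPUs.
# Assumes FP8 (1 byte/parameter); KV cache and activations are ignored.
def weight_gb(params_billions: float, bytes_per_param: float) -> float:
    """GB needed just for the weights: 1B params x 1 byte = 1 GB."""
    return params_billions * bytes_per_param

total_gb = weight_gb(128, 1.0)   # FP8 weights
per_gpu_gb = total_gb / 4        # even split across 4 GPUs
print(total_gb, per_gpu_gb)      # → 128.0 32.0
```

At FP8 the weights alone fit in ~32 GB per GPU; a BF16 deployment (2 bytes/param) would double that, which is why the precision assumption matters.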
| Feature | Value |
|---|---|
| Architecture | Dense 128B |
| Context | 256,000 tokens |
| SWE-Bench Verified | 77.6% |
| τ³-Telecom | 91.4 |
| License | Modified MIT (open weights) |
| API — input tokens | $1.50 / million |
| API — output tokens | $7.50 / million |
| Self-hosting (min. GPU) | 4 GPUs |
The model surpasses Devstral 2 and Qwen3.5-397B-A17B on SWE-Bench Verified, positioning it as a benchmark among open-weights coding models at launch. It is available through the Mistral API, Le Chat, Vibe, NVIDIA endpoints (build.nvidia.com), and the NVIDIA NIM microservice.
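At the listed API rates, per-request cost is straightforward to estimate. A minimal sketch using the prices from the table above (the request sizes in the example are hypothetical):

```python
# Estimate Mistral Medium 3.5 API cost from the published per-million-token rates.
INPUT_PRICE_PER_M = 1.50   # USD per million input tokens
OUTPUT_PRICE_PER_M = 7.50  # USD per million output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single request, rounded to 6 decimals."""
    cost = (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000
    return round(cost, 6)

# Example: a 20k-token prompt producing a 2k-token answer.
print(request_cost(20_000, 2_000))  # → 0.045
```

The 5× input/output price asymmetry means long-context prompts stay cheap relative to verbose generations.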
Remote agents in Vibe
Vibe coding sessions can now run in the cloud, without keeping a local session open. Multiple sessions run in parallel while the developer works on something else. A local session can be “teleported” to the cloud with its full history and state. When the task is done, the agent automatically opens a pull request on GitHub and notifies the developer.
Vibe integrates natively with GitHub (code + PRs), Linear and Jira (tickets), Sentry (incidents), and Slack and Teams (notifications). Each session runs in an isolated environment.
Work Mode in Le Chat (preview)
A new agentic mode for complex tasks in Le Chat: multi-source research, document synthesis, inbox sorting, Jira ticket creation, sending summaries to Slack. Connectors are enabled by default in Work Mode. Every action is visible, and sensitive operations require explicit approval.
Google 8th-Generation TPU — TPU 8t and TPU 8i
April 29 — Google unveils its eighth generation of TPU (Tensor Processing Unit) chips, announced at Google Cloud Next ‘26 the previous week. Two distinct chips make up this generation, each optimized for a different phase of the AI cycle.
A decade in the making, the chips for the agentic era have arrived. At @GoogleCloud’s Next ‘26 event last week, we unveiled our eighth-generation TPUs. TPU 8t: 3x more powerful than previous gen, 10x faster data movement, 97% productive resource utilization, training time from months to weeks. TPU 8i: tripled internal memory, 80% better perf/dollar, 5x latency reduction. — @GoogleAI on X
TPU 8t — model training
| Improvement | Detail |
|---|---|
| Raw power | 3× higher than the previous generation |
| Data throughput | 10× faster (storage → chips) |
| Productive utilization | 97% of resources (automatic failure detection and rerouting) |
| Impact | Training time reduced from several months to a few weeks |
TPU 8i — inference for AI agents
| Improvement | Detail |
|---|---|
| Internal memory | Tripled to handle complex multi-step reasoning |
| Cost efficiency | +80% performance per dollar spent |
| Latency | Reduced by 5× thanks to a new integrated engine |
These chips are designed for the agentic era: TPU 8t accelerates model creation, while TPU 8i allows these models to act (book a flight, manage a calendar) in near real time. Google positions this dual architecture as the technological foundation of the next decade.
Claude for Creative Work — Blender, Autodesk Fusion, Adobe, and 5 other MCP connectors
April 28 — Anthropic launches a series of official MCP (Model Context Protocol) connectors for professionals in the creative industries, in partnership with Blender, Autodesk, Adobe, Ableton, and Splice.
| Tool | Use |
|---|---|
| Blender | Debugging 3D scenes, creating tools, batch edits across all objects |
| Autodesk Fusion | Creating and modifying 3D models with natural language |
| Adobe Creative Cloud | Creating images, videos, and designs via 50+ Creative Cloud tools |
| Ableton Live and Push | Exploring official product documentation |
| Splice | Searching for royalty-free samples directly from Claude |
| Affinity (Canva) | Automating repetitive production tasks |
| SketchUp | Starting point for 3D modeling from text description |
| Resolume / Touchdesigner | Real-time natural language control for VJs and visual artists |
“Claude now connects to the tools creative professionals already use. With the new Blender connector, you can debug a scene, build new tools, or batch-apply changes across every object, directly from Claude.” — @claudeai on X
Anthropic has also joined the Blender Development Fund as a patron donor, supporting the development of the open-source software. The main tweet generated more than 10 million views in less than 24 hours (the Autodesk Fusion tweet reached 11 million), making it one of Anthropic’s most viral announcements in several months.
Highlighted use cases: learning complex software, extending tools with code (scripts, plugins, generative systems via Claude Code), bridging tools in a pipeline, automating repetitive tasks (batch processing, scaffolding).
NVIDIA Nemotron 3 Nano Omni — 30B open-source omnimodal model
April 28 — NVIDIA launches Nemotron 3 Nano Omni, an open-source omnimodal model that unifies vision, audio, and language in a single architecture.
| Parameter | Value |
|---|---|
| Architecture | Hybrid MoE 30B-A3B (30B total, 3B active) |
| Context | 256K tokens |
| Modalities (input) | Text, images, audio, video, documents, charts, interfaces |
| Modalities (output) | Text |
| Efficiency | 9× higher throughput than other open omnimodal models |
| Availability | Hugging Face, OpenRouter, build.nvidia.com, 25+ partner platforms |
The model excels in three use cases: computer use (native-resolution 1920×1080 graphical interface navigation), document intelligence (interpreting PDFs, tables, charts, screenshots), and maintaining audio-video context within a single reasoning stream.
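For the computer-use case, an omnimodal model is typically queried through an OpenAI-compatible chat endpoint with mixed text and image content. A sketch of building such a payload — the model identifier below is a guess at the published name, and the image URL is a placeholder, neither confirmed by the source:

```python
# Build an OpenAI-style multimodal chat payload (text + screenshot), as one
# would POST to an OpenAI-compatible endpoint such as OpenRouter or
# build.nvidia.com. Model ID and URL are hypothetical placeholders.
def build_screen_query(image_url: str, question: str) -> dict:
    return {
        "model": "nvidia/nemotron-3-nano-omni",  # assumed identifier
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    }

payload = build_screen_query("https://example.com/screen.png",
                             "Which button submits the form?")
print(payload["model"])
```

The same message shape extends to audio or video parts on providers that accept them; only the content-part types change.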
Organizations such as Aible, H Company, Palantir, Foxconn, and Oracle are evaluating the model at launch. H Company is integrating it into its computer use agent.
“To build useful agents, you can’t wait seconds for a model to interpret a screen. By building on Nemotron 3 Nano Omni, our agents can rapidly interpret full HD screen recordings — something that wasn’t practical before.” — Gautier Cloix, CEO of H Company
The Nemotron family has also reached 50 million cumulative downloads across all Nano/Super/Ultra variants in one year.
ElevenMusic — AI music platform (discovery, remixing, creation, monetization)
April 29 — ElevenLabs launches ElevenMusic, an AI music platform that connects listening, remixing, and original creation in a single system, with direct monetization for artists.
| Feature | Description |
|---|---|
| Discovery | 4,000+ independent artists, curated catalog |
| Remixing | Change genre, tempo, reinterpret a track |
| Creation | From lyrics, melody, or mood |
| Publishing | Distribution + monetization via fan engagement |
The business model is inspired by ElevenLabs’ Voice Library, which has already paid out $11 million to its creators. Artists publish and earn based on listener engagement, without an intermediary label.
ElevenMusic launches with Eleven Album Vol. 2, a compilation featuring Danger Twins and Justin Love, designed to be experienced and remixed within the platform. Kevin Jonas Sr. (Jonas Group Entertainment) and Amy Stroup (Danger Twins) are among the artistic partners at launch.
“Fans want to feel like they’re part of the music, the songwriters, and the artists. ElevenMusic gives them a way in, turning a song into something people can step into, not just listen to.” — Kevin Jonas Sr., Founder and President of Jonas Group Entertainment
The platform is available on mobile app and web starting April 29, 2026.
🔗 @ElevenLabs announcement on X — 🔗 ElevenLabs blog
GitHub Copilot code review — double billing starting June 1, 2026
April 27 — GitHub announces that starting June 1, 2026, each automated code review by GitHub Copilot will consume GitHub Actions minutes in addition to the AI credits already charged under the new usage-based model.
Until now, Copilot code reviews consumed only premium request units (PRUs). Starting June 1, two counters will be activated simultaneously for private repositories:
| Counter | Detail |
|---|---|
| AI Credits | Any Copilot usage (including code review) billed in AI credits, in accordance with the usage-based model |
| GitHub Actions Minutes | Consumed from the plan allowance for each review on a private repository; additional minutes billed at standard Actions rates |
This double counting is explained by the agentic architecture of Copilot code review: the tool relies on GitHub-hosted runners to analyze the broader repository context and produce more relevant feedback.
Affected plans: Copilot Pro, Pro+, Business, Enterprise — including reviews initiated by unlicensed users through direct billing to the organization.
Public repositories: no change, Actions minutes remain free.
To prepare before June 1:
- Check current Actions usage in billing settings
- Adjust Actions spending limits if necessary
- Inform the organization’s billing managers
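To size the new Actions line item before June 1, a rough estimate can be made from review volume. The per-review runtime below is a placeholder assumption (measure your own runs); $0.008/min is the standard rate for GitHub-hosted Linux runners on private repositories:

```python
# Rough monthly estimate of the Actions overage billed for Copilot code reviews.
# minutes_per_review is an assumed placeholder, not a GitHub figure.
def extra_actions_cost(reviews_per_month: int, minutes_per_review: float,
                       included_minutes: int,
                       price_per_minute: float = 0.008) -> float:
    """USD billed beyond the plan allowance at the standard Linux runner rate."""
    used = reviews_per_month * minutes_per_review
    overage = max(0.0, used - included_minutes)
    return round(overage * price_per_minute, 2)

# 400 reviews x 3 min = 1200 min against a 3000-min allowance → fully covered.
print(extra_actions_cost(400, 3, 3000))   # → 0.0
# 2000 reviews x 3 min = 6000 min → 3000 min overage x $0.008 = $24.00
print(extra_actions_cost(2000, 3, 3000))  # → 24.0
```

Plugging in last quarter's review count gives a quick signal on whether the plan's included minutes will absorb the change.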
OpenAI DevDay 2026 — San Francisco, September 29
April 29 — OpenAI has announced the return of its annual developer event: OpenAI DevDay 2026 will take place on September 29 in San Francisco. Official registration has not opened yet.
To encourage early momentum, OpenAI is launching a contest: developers who build something with GPT-5.5 and image generation can compete for early-access invitations. To enter, submit a link to the project along with a note explaining how it was built, using the official hashtag #OpenAIDevDay2026.
| Detail | Value |
|---|---|
| Date | September 29, 2026 |
| Location | San Francisco |
| Official hashtag | #OpenAIDevDay2026 |
| Tweet views (first hours) | 239,000+ |
The announcement was published five months in advance, which is unusually early for a DevDay. Previous editions have served as the stage for OpenAI’s most significant product launches for the developer community: in 2023, GPT-4 Turbo and the Assistants API were introduced there. With the current acceleration in release cadence — GPT-5.5, image generation, Codex CLI — DevDay 2026 is shaping up to be an important milestone on the calendar for technical teams integrating OpenAI models into production.
A separate thread invites developers to share their creations right away. The @OpenAIDevs account relayed the announcement within minutes of the main post.
Agent ecosystem and new integrations
Claude Code CLI v2.1.120–2.1.123 — 50+ fixes
April 28 — The Claude Code team detailed the fixes delivered across the last four CLI versions (v2.1.120 to v2.1.123): more than 50 stability and performance improvements.
| Metric | Value |
|---|---|
| Versions affected | v2.1.120, v2.1.121, v2.1.122, v2.1.123 |
| Number of fixes | 50+ |
| Performance gain (/resume) | Up to 67% faster |
| @ClaudeDevs thread views | 493k |
The five focus areas: faster long sessions (/resume up to 67% faster), stabilized macOS authentication (a dozen keychain fixes), reduced memory usage on Linux, WebFetch without freezing on large pages, and copy-paste preserving line breaks from Windows and Xcode.
OpenAI × AWS — Codex and Managed Agents on Amazon Bedrock
April 28 — OpenAI and AWS are expanding their strategic partnership across three areas: access to OpenAI models in AWS environments, Codex on Bedrock (limited preview, for organizations that want to keep their data within Amazon infrastructure), and Bedrock Managed Agents powered by OpenAI (available immediately). Codex has more than 4 million weekly users.
Copilot cloud agent starts 20% faster
April 27 — GitHub Copilot cloud agent now starts more than 20% faster thanks to preconfigured runner environments via custom GitHub Actions images. This improvement comes on top of the 50% reduction already delivered in March 2026.
Gemini — downloadable file generation
April 29 — Gemini can now create downloadable files directly from chat: PDF, Word (.docx), Excel (.xlsx), Google Docs/Sheets/Slides, CSV, LaTeX, RTF, and Markdown. Available immediately for all web and mobile users.
Mistral Workflows in public preview
April 27 — Mistral AI is launching Workflows in public preview, an enterprise orchestration layer built on Temporal’s durable execution engine (the same infrastructure used by Netflix, Stripe, and Salesforce). Flows are written in Python via the Mistral SDK v3.0, then triggered from Le Chat by business teams. ASML, France Travail, and La Banque Postale are already using it.
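The point of durable execution is that a flow which fails mid-run resumes from its last completed step instead of restarting from scratch. A deliberately simplified in-memory sketch of that semantic — this is illustrative only and does not use the Mistral SDK or the Temporal API:

```python
# Illustrative only: durable execution resumes from the last checkpointed step.
# Real engines (e.g. Temporal) persist this state server-side; here a dict
# stands in for the checkpoint store.
def run_workflow(steps, checkpoint=None):
    """Run named steps in order, skipping any already recorded as done."""
    done = dict(checkpoint or {})
    for name, fn in steps:
        if name in done:
            continue  # completed in a previous attempt; do not rerun
        done[name] = fn()
    return done

steps = [
    ("fetch", lambda: "records"),
    ("summarize", lambda: "summary"),
]
# A first attempt completed only "fetch"; resuming finishes the rest.
result = run_workflow(steps, checkpoint={"fetch": "records"})
print(sorted(result))  # → ['fetch', 'summarize']
```

The design consequence is that steps must be deterministic or idempotent, since the engine decides which ones to replay.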
Qwen FlashQLA — linear attention kernels
April 29 — Qwen has released FlashQLA, a high-performance linear attention kernel library built on TileLang, designed for agentic AI on personal devices: 2–3× gains in the forward pass and 2× in the backward pass. Released as open source on GitHub.
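Linear attention replaces the softmax similarity with a feature map φ so that attention can be computed as φ(Q)(φ(K)ᵀV), turning the O(n²) cost in sequence length into O(n). A minimal NumPy sketch of the idea (illustrating the technique only, not the FlashQLA kernels themselves):

```python
import numpy as np

def linear_attention(q, k, v, eps=1e-6):
    """O(n) attention via the kernel trick: phi(q) @ (phi(k).T @ v).

    phi(x) = elu(x) + 1 keeps features positive, in the style of
    Katharopoulos et al. (2020).
    """
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1
    q, k = phi(q), phi(k)
    kv = k.T @ v                  # (d, d_v): keys/values summarized once
    z = q @ k.sum(axis=0)         # (n,): per-query normalizer
    return (q @ kv) / (z[:, None] + eps)

rng = np.random.default_rng(0)
n, d = 8, 4
out = linear_attention(rng.normal(size=(n, d)),
                       rng.normal(size=(n, d)),
                       rng.normal(size=(n, d)))
print(out.shape)  # → (8, 4)
```

Because the (d × d_v) summary `kv` is independent of sequence length, the same trick supports constant-memory recurrent decoding, which is what makes it attractive on personal devices.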
GPT Image 2 integrated into Manus Slides
April 29 — Manus has integrated GPT Image 2 into Manus Slides: point-and-click visual editing, prompt-based replacement, presentation note generation, export to Google Slides, PowerPoint, PDF, Google Drive, and OneDrive.
Salesforce connected to Genspark
April 29 — Genspark has integrated Salesforce into its agent ecosystem: connection via Genspark Claw (CLI installation by instruction) or Super Agent (direct connection). Use cases include automatic customer request handling, quarterly dashboards, and automated sales pipeline management.
GPT-5.5 and ChatGPT Images 2.0 on Genspark
April 28 — Genspark has integrated GPT-5.5 into its AI chat and ChatGPT Images 2.0 (GPT Image 2) into its image generator, available respectively at genspark.ai/agents and genspark.ai/ai_image.
Pika Agents — creative conversational interface
April 28 — Pika has launched Pika Agents: a video creation interface that replaces the prompt box with a personalized AI agent (voice, face, personality configured by the user). The agent understands creative intent in natural language and assembles, refines, and produces in a single conversation.
Codex seats at $0 for ChatGPT Business through the end of June
April 29 — OpenAI is allowing eligible ChatGPT Business subscribers to add Codex seats with no seat cost through the end of June 2026, accompanying Codex’s expansion on AWS.
60-year-old Erdős problem solved with GPT-5.5
April 28 — OpenAI published a podcast episode in which Sébastien Bubeck and Ernest Ryu revisit the solution to a mathematical problem that had remained open for 60 years, attributed to Paul Erdős, with the help of GPT-5.5. The tweet exceeded 399,000 views.
Briefs
- DeepSeek-V4-Pro: -75% promo extended — The 75% discount on the DeepSeek-V4-Pro API has been extended through May 31, 2026. Promotional pricing: $0.003625/M input tokens (cache hit), $0.435 (cache miss), $0.87 output. 🔗 DeepSeek tweet
- Google DeepMind — Experience AI in Latin America — The educational program Experience AI (Raspberry Pi Foundation) is expanding in Latin America with a goal of training 24,000 teachers and reaching 1.25 million students by 2028, funded with $4.6 million from Google.org. 🔗 Google DeepMind tweet
- GPT-5.3-Codex removed from Copilot Student model picker — Effective April 27, 2026, GPT-5.3-Codex can no longer be selected manually in the Copilot Student plan; it remains accessible through automatic selection. 🔗 GitHub changelog
- Responses API — blocked domains for web search — OpenAI’s Responses API now makes it possible to block specific domains while keeping web search enabled, in order to exclude specific sources from results. 🔗 @charlierguo tweet
- OpenAI — community safety commitment — OpenAI published an article detailing its safety practices in ChatGPT: model-level risk mitigation, automated monitoring, connecting users with support resources, and reporting to authorities in serious cases. A transparency publication with no new feature. 🔗 OpenAI announcement
What it means
The race for open models is intensifying. Mistral Medium 3.5 (128B, SWE-Bench 77.6%) and NVIDIA Nemotron 3 Nano Omni (30B, 9× more efficient than other open omnimodal models) arrive simultaneously with permissive licenses. Both position themselves as credible alternatives to closed frontier models: Mistral on code and reasoning, Nemotron on agentic multimodality. This pressure keeps narrowing the gap between proprietary models and open weights.
Hardware infrastructure remains the strategic bottleneck. Google’s 8th-generation TPUs (3× in training, 5× lower inference latency) illustrate that the AI race is also being fought at the silicon level. The Google Cloud Next ‘26 announcement positions Google infrastructure as a durable competitive advantage against NVIDIA GPUs — even if both coexist in real deployments.
The agentic ecosystem is fragmenting into vertical specializations. This week, AI agents are embedding themselves in creative tools (Claude for Creative Work with 8+ MCP connectors), software development (Vibe Remote Agents, Copilot cloud agent 20% faster), music (ElevenMusic), video (Pika Agents), CRMs (Salesforce in Genspark), and enterprise workflows (Mistral Workflows). The question is no longer “can AI do this?” but “in which specialized tool and under what billing model?”
Usage-based billing is transforming developers’ business models. GitHub Copilot code review’s shift to double charging (AI credits + Actions minutes) starting June 1, combined with the Codex seats offer at $0 for ChatGPT Business, illustrates a broader dynamic: publishers subsidize adoption (temporary free access, DeepSeek’s -75% promo) to create habits before normalizing usage-based billing. Technical teams would benefit from auditing their AI spending categories before June.
Sources
- Mistral Medium 3.5 + Vibe Remote Agents
- @mistralvibe announcement on X
- Mistral Workflows
- Google 8th-generation TPU — @GoogleAI on X
- Claude for Creative Work — Anthropic
- NVIDIA Nemotron 3 Nano Omni — NVIDIA Blog
- ElevenMusic — ElevenLabs Blog
- GitHub Copilot code review → Actions minutes
- Copilot cloud agent 20% faster
- OpenAI DevDay 2026 — @OpenAI on X
- OpenAI × AWS
- Codex seats $0 — @OpenAIDevs on X
- Erdős problem — @OpenAI on X
- Gemini file generation — Google Blog
- Qwen FlashQLA — GitHub
- GPT Image 2 in Manus Slides
- Salesforce in Genspark
- GPT-5.5 and ChatGPT Images 2.0 on Genspark
- Pika Agents
- Claude Code CLI v2.1.123 — @ClaudeDevs on X
- DeepSeek-V4-Pro promo extended
- Google DeepMind Experience AI Latin America
- GPT-5.3-Codex removed from Copilot Student
- Responses API blocked domains
- OpenAI community safety commitment
This document was translated from the French version into English using the gpt-5.4 model. For more information about the translation process, see https://gitlab.com/jls42/ai-powered-markdown-translator