This week marks an acceleration on three simultaneous fronts: open models (Mistral Medium 3.5, NVIDIA Nemotron 3 Nano Omni), hardware infrastructure (Google 8th-generation TPU), and agent ecosystems (Vibe Remote Agents, Claude for Creative Work, GitHub Copilot). ElevenLabs also reaches a milestone by turning its AI music engine into a consumer platform with monetization.
Mistral Medium 3.5, Vibe Remote Agents, and Le Chat Work Mode
April 29 — Mistral AI is publishing three major announcements simultaneously: the Mistral Medium 3.5 model, remote agents in Vibe, and Work Mode in Le Chat.
Mistral Medium 3.5 in public preview
Medium 3.5 is a dense 128-billion-parameter model that unifies instruction following, reasoning, and code in a single set of weights, with a 256,000-token context window. It can run locally on only four GPUs.
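Whether four GPUs actually suffice depends on weight precision. A back-of-envelope sketch, assuming FP8 weights (1 byte per parameter — an assumption, not a figure from the announcement) and ignoring KV cache and activations:

```python
# Rough VRAM estimate for serving a dense 128B model across 4 GPUs.
# Assumes FP8 (1 byte/parameter); KV cache and activations are ignored.
def weight_gb(params_billions: float, bytes_per_param: float) -> float:
    """GB needed just for the weights: 1B params x 1 byte = 1 GB."""
    return params_billions * bytes_per_param

total_gb = weight_gb(128, 1.0)   # FP8 weights
per_gpu_gb = total_gb / 4        # even split across 4 GPUs
print(total_gb, per_gpu_gb)      # → 128.0 32.0
```

At FP8 the weights alone fit in ~32 GB per GPU; a BF16 deployment (2 bytes/param) would double that, which is why the precision assumption matters.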
| Feature | Value |
|---|---|
| Architecture | Dense 128B |
| Context | 256,000 tokens |
| SWE-Bench Verified | 77.6% |
| τ³-Telecom | 91.4 |
| License | Modified MIT (open weights) |
| API — input tokens | $1.50 / million |
| API — output tokens | $7.50 / million |
| Self-hosting (min. GPU) | 4 GPUs |
The model surpasses Devstral 2 and Qwen3.5-397B-A17B on SWE-Bench Verified, positioning it as a benchmark among open-weights coding models at launch. It is available through the Mistral API, Le Chat, Vibe, NVIDIA endpoints (build.nvidia.com), and the NVIDIA NIM microservice.
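At the listed API rates, per-request cost is straightforward to estimate. A minimal sketch using the prices from the table above (the request sizes in the example are hypothetical):

```python
# Estimate Mistral Medium 3.5 API cost from the published per-million-token rates.
INPUT_PRICE_PER_M = 1.50   # USD per million input tokens
OUTPUT_PRICE_PER_M = 7.50  # USD per million output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single request, rounded to 6 decimals."""
    cost = (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000
    return round(cost, 6)

# Example: a 20k-token prompt producing a 2k-token answer.
print(request_cost(20_000, 2_000))  # → 0.045
```

The 5× input/output price asymmetry means long-context prompts stay cheap relative to verbose generations.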
Remote agents in Vibe
Vibe coding sessions can now run in the cloud, without keeping a local session open. Multiple sessions run in parallel while the developer works on something else. A local session can be “teleported” to the cloud with its full history and state. When the task is done, the agent automatically opens a pull request on GitHub and notifies the developer.
Vibe integrates natively with GitHub (code + PRs), Linear and Jira (tickets), Sentry (incidents), and Slack and Teams (notifications). Each session runs in an isolated environment.
Work Mode in Le Chat (preview)
A new agentic mode for complex tasks in Le Chat: multi-source research, document synthesis, inbox sorting, Jira ticket creation, sending summaries to Slack. Connectors are enabled by default in Work Mode. Every action is visible, and sensitive operations require explicit approval.
Google 8th-Generation TPU — TPU 8t and TPU 8i
April 29 — Google unveils its eighth generation of TPU (Tensor Processing Unit) chips, announced at Google Cloud Next ‘26 the previous week. Two distinct chips make up this generation, each optimized for a different phase of the AI cycle.
A decade in the making, the chips for the agentic era have arrived. At @GoogleCloud’s Next ‘26 event last week, we unveiled our eighth-generation TPUs. TPU 8t: 3x more powerful than previous gen, 10x faster data movement, 97% productive resource utilization, training time from months to weeks. TPU 8i: tripled internal memory, 80% better perf/dollar, 5x latency reduction. — @GoogleAI on X
TPU 8t — model training
| Improvement | Detail |
|---|---|
| Raw power | 3× higher than the previous generation |
| Data throughput | 10× faster (storage → chips) |
| Productive utilization | 97% of resources (automatic failure detection and rerouting) |
| Impact | Training time reduced from several months to a few weeks |
TPU 8i — inference for AI agents
| Improvement | Detail |
|---|---|
| Internal memory | Tripled to handle complex multi-step reasoning |
| Cost efficiency | +80% performance per dollar spent |
| Latency | Reduced by 5× thanks to a new integrated engine |
These chips are designed for the agentic era: TPU 8t accelerates model creation, while TPU 8i allows these models to act (book a flight, manage a calendar) in near real time. Google positions this dual architecture as the technological foundation of the next decade.
Claude for Creative Work — Blender, Autodesk Fusion, Adobe, and 5 other MCP connectors
April 28 — Anthropic launches a series of official MCP (Model Context Protocol) connectors for professionals in the creative industries, in partnership with Blender, Autodesk, Adobe, Ableton, and Splice.
| Tool | Use |
|---|---|
| Blender | Debugging 3D scenes, creating tools, batch edits across all objects |
| Autodesk Fusion | Creating and modifying 3D models with natural language |
| Adobe Creative Cloud | Creating images, videos, and designs via 50+ Creative Cloud tools |
| Ableton Live and Push | Exploring official product documentation |
| Splice | Searching for royalty-free samples directly from Claude |
| Affinity (Canva) | Automating repetitive production tasks |
| SketchUp | Starting point for 3D modeling from text description |
| Resolume / Touchdesigner | Real-time natural language control for VJs and visual artists |
“Claude now connects to the tools creative professionals already use. With the new Blender connector, you can debug a scene, build new tools, or batch-apply changes across every object, directly from Claude.” — @claudeai on X
Anthropic has also joined the Blender Development Fund as a patron donor, supporting the development of the open-source software. The main tweet generated more than 10 million views in less than 24 hours (the Autodesk Fusion tweet reached 11 million), making it one of Anthropic’s most viral announcements in several months.
Highlighted use cases: learning complex software, extending tools with code (scripts, plugins, generative systems via Claude Code), bridging tools in a pipeline, automating repetitive tasks (batch processing, scaffolding).
NVIDIA Nemotron 3 Nano Omni — 30B open-source omnimodal model
April 28 — NVIDIA launches Nemotron 3 Nano Omni, an open-source omnimodal model that unifies vision, audio, and language in a single architecture.
| Parameter | Value |
|---|---|
| Architecture | Hybrid MoE 30B-A3B (30B total, 3B active) |
| Context | 256K tokens |
| Modalities (input) | Text, images, audio, video, documents, charts, interfaces |
| Modalities (output) | Text |
| Efficiency | 9× higher throughput than other open omnimodal models |
| Availability | Hugging Face, OpenRouter, build.nvidia.com, 25+ partner platforms |
The model excels in three use cases: computer use (native-resolution 1920×1080 graphical interface navigation), document intelligence (interpreting PDFs, tables, charts, screenshots), and maintaining audio-video context within a single reasoning stream.
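For the computer-use case, an omnimodal model is typically queried through an OpenAI-compatible chat endpoint with mixed text and image content. A sketch of building such a payload — the model identifier below is a guess at the published name, and the image URL is a placeholder, neither confirmed by the source:

```python
# Build an OpenAI-style multimodal chat payload (text + screenshot), as one
# would POST to an OpenAI-compatible endpoint such as OpenRouter or
# build.nvidia.com. Model ID and URL are hypothetical placeholders.
def build_screen_query(image_url: str, question: str) -> dict:
    return {
        "model": "nvidia/nemotron-3-nano-omni",  # assumed identifier
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    }

payload = build_screen_query("https://example.com/screen.png",
                             "Which button submits the form?")
print(payload["model"])
```

The same message shape extends to audio or video parts on providers that accept them; only the content-part types change.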
Organizations such as Aible, H Company, Palantir, Foxconn, and Oracle are evaluating the model at launch. H Company is integrating it into its computer use agent.
“To build useful agents, you can’t wait seconds for a model to interpret a screen. By building on Nemotron 3 Nano Omni, our agents can rapidly interpret full HD screen recordings — something that wasn’t practical before.” — Gautier Cloix, CEO of H Company
The Nemotron family has also reached 50 million cumulative downloads across all Nano/Super/Ultra variants in one year.
ElevenMusic — AI music platform (discovery, remixing, creation, monetization)
April 29 — ElevenLabs launches ElevenMusic, an AI music platform that connects listening, remixing, and original creation in a single system, with direct monetization for artists.
| Feature | Description |
|---|---|
| Discovery | 4,000+ independent artists, curated catalog |
| Remixing | Change genre, tempo, reinterpret a track |
| Creation | From lyrics, melody, or mood |
| Publishing | Distribution + monetization via fan engagement |
The business model is inspired by ElevenLabs’ Voice Library, which has already paid out $11 million to its creators. Artists publish and earn based on listener engagement, without an intermediary label.
ElevenMusic launches with Eleven Album Vol. 2, a compilation featuring Danger Twins and Justin Love, designed to be experienced and remixed within the platform. Kevin Jonas Sr. (Jonas Group Entertainment) and Amy Stroup (Danger Twins) are among the artistic partners at launch.
“Fans want to feel like they’re part of the music, the songwriters, and the artists. ElevenMusic gives them a way in, turning a song into something people can step into, not just listen to.” — Kevin Jonas Sr., Founder and President of Jonas Group Entertainment
The platform is available on mobile app and web starting April 29, 2026.
🔗 @ElevenLabs announcement on X — 🔗 ElevenLabs blog
GitHub Copilot code review — double billing starting June 1, 2026
April 27 — GitHub announces that starting June 1, 2026, each automated code review by GitHub Copilot will consume GitHub Actions minutes in addition to the AI credits already charged under the new usage-based model.
Until now, Copilot code reviews consumed only premium request units (PRUs). Starting June 1, two counters will be activated simultaneously for private repositories:
| Counter | Detail |
|---|---|
| AI Credits | Any Copilot usage (including code review) billed in AI credits, in accordance with the usage-based model |
| GitHub Actions Minutes | Consumed from the plan allowance for each review on a private repository; additional minutes billed at standard Actions rates |
This double counting is explained by the agentic architecture of Copilot code review: the tool relies on GitHub-hosted runners to analyze the broader repository context and produce more relevant feedback.
Affected plans: Copilot Pro, Pro+, Business, Enterprise — including reviews initiated by unlicensed users through direct billing to the organization.
Public repositories: no change, Actions minutes remain free.
To prepare before June 1:
- Check current Actions usage in billing settings
- Adjust Actions spending limits if necessary
- Inform the organization’s billing managers
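To size the new Actions line item before June 1, a rough estimate can be made from review volume. The per-review runtime below is a placeholder assumption (measure your own runs); $0.008/min is the standard rate for GitHub-hosted Linux runners on private repositories:

```python
# Rough monthly estimate of the Actions overage billed for Copilot code reviews.
# minutes_per_review is an assumed placeholder, not a GitHub figure.
def extra_actions_cost(reviews_per_month: int, minutes_per_review: float,
                       included_minutes: int,
                       price_per_minute: float = 0.008) -> float:
    """USD billed beyond the plan allowance at the standard Linux runner rate."""
    used = reviews_per_month * minutes_per_review
    overage = max(0.0, used - included_minutes)
    return round(overage * price_per_minute, 2)

# 400 reviews x 3 min = 1200 min against a 3000-min allowance → fully covered.
print(extra_actions_cost(400, 3, 3000))   # → 0.0
# 2000 reviews x 3 min = 6000 min → 3000 min overage x $0.008 = $24.00
print(extra_actions_cost(2000, 3, 3000))  # → 24.0
```

Plugging in last quarter's review count gives a quick signal on whether the plan's included minutes will absorb the change.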
OpenAI DevDay 2026 — San Francisco, September 29
April 29 — OpenAI has announced the return of its annual developer event: OpenAI DevDay 2026 will take place on September 29 in San Francisco. Official registration has not opened yet.
To encourage early momentum, OpenAI is launching a contest: developers who build something with GPT-5.5 and image generation can compete for early-access invitations. To enter, submit a link to the project along with a note explaining how it was built, using the official hashtag #OpenAIDevDay2026.
| Detail | Value |
|---|---|
| Date | September 29, 2026 |
| Location | San Francisco |
| Official hashtag | #OpenAIDevDay2026 |
| Tweet views (first hours) | 239,000+ |
The announcement was published five months in advance, which is unusually early for a DevDay. Previous editions have served as the stage for OpenAI’s most significant product launches for the developer community: in 2023, GPT-4 Turbo and the Assistants API were introduced there. With the current acceleration in release cadence — GPT-5.5, image generation, Codex CLI — DevDay 2026 is shaping up to be an important milestone on the calendar for technical teams integrating OpenAI models into production.
A separate thread invites developers to share their creations right away. The @OpenAIDevs account relayed the announcement within minutes of the main post.
Agent ecosystem and new integrations
Claude Code CLI v2.1.120–2.1.123 — 50+ fixes
April 28 — The Claude Code team detailed the fixes delivered across the last four CLI versions (v2.1.120 to v2.1.123): more than 50 stability and performance improvements.
| Metric | Value |
|---|---|
| Versions affected | v2.1.120, v2.1.121, v2.1.122, v2.1.123 |
| Number of fixes | 50+ |
| Performance gain (/resume) | Up to 67% faster |
| @ClaudeDevs thread views | 493k |
The five focus areas: faster long sessions (/resume up to 67% faster), stabilized macOS authentication (a dozen keychain fixes), reduced memory usage on Linux, WebFetch without freezing on large pages, and copy-paste preserving line breaks from Windows and Xcode.
OpenAI × AWS — Codex and Managed Agents on Amazon Bedrock
April 28 — OpenAI and AWS are expanding their strategic partnership across three areas: access to OpenAI models in AWS environments, Codex on Bedrock (limited preview, for organizations that want to keep their data within Amazon infrastructure), and Bedrock Managed Agents powered by OpenAI (available immediately). Codex has more than 4 million weekly users.
Copilot cloud agent starts 20% faster
April 27 — GitHub Copilot cloud agent now starts more than 20% faster thanks to preconfigured runner environments via custom GitHub Actions images. This improvement comes on top of the 50% reduction already delivered in March 2026.
Gemini — downloadable file generation
April 29 — Gemini can now create downloadable files directly from chat: PDF, Word (.docx), Excel (.xlsx), Google Docs/Sheets/Slides, CSV, LaTeX, RTF, and Markdown. Available immediately for all web and mobile users.
Mistral Workflows in public preview
April 27 — Mistral AI is launching Workflows in public preview, an enterprise orchestration layer built on Temporal’s durable execution engine (the same infrastructure used by Netflix, Stripe, and Salesforce). Flows are written in Python via the Mistral SDK v3.0, then triggered from Le Chat by business teams. ASML, France Travail, and La Banque Postale are already using it.
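The point of durable execution is that a flow which fails mid-run resumes from its last completed step instead of restarting from scratch. A deliberately simplified in-memory sketch of that semantic — this is illustrative only and does not use the Mistral SDK or the Temporal API:

```python
# Illustrative only: durable execution resumes from the last checkpointed step.
# Real engines (e.g. Temporal) persist this state server-side; here a dict
# stands in for the checkpoint store.
def run_workflow(steps, checkpoint=None):
    """Run named steps in order, skipping any already recorded as done."""
    done = dict(checkpoint or {})
    for name, fn in steps:
        if name in done:
            continue  # completed in a previous attempt; do not rerun
        done[name] = fn()
    return done

steps = [
    ("fetch", lambda: "records"),
    ("summarize", lambda: "summary"),
]
# A first attempt completed only "fetch"; resuming finishes the rest.
result = run_workflow(steps, checkpoint={"fetch": "records"})
print(sorted(result))  # → ['fetch', 'summarize']
```

The design consequence is that steps must be deterministic or idempotent, since the engine decides which ones to replay.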
Qwen FlashQLA — linear attention kernels
April 29 — Qwen has released FlashQLA, a high-performance linear attention kernel library built on TileLang, designed for agentic AI on personal devices: 2–3× gains in the forward pass and 2× in the backward pass. Released as open source on GitHub.
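Linear attention replaces the softmax similarity with a feature map φ so that attention can be computed as φ(Q)(φ(K)ᵀV), turning the O(n²) cost in sequence length into O(n). A minimal NumPy sketch of the idea (illustrating the technique only, not the FlashQLA kernels themselves):

```python
import numpy as np

def linear_attention(q, k, v, eps=1e-6):
    """O(n) attention via the kernel trick: phi(q) @ (phi(k).T @ v).

    phi(x) = elu(x) + 1 keeps features positive, in the style of
    Katharopoulos et al. (2020).
    """
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1
    q, k = phi(q), phi(k)
    kv = k.T @ v                  # (d, d_v): keys/values summarized once
    z = q @ k.sum(axis=0)         # (n,): per-query normalizer
    return (q @ kv) / (z[:, None] + eps)

rng = np.random.default_rng(0)
n, d = 8, 4
out = linear_attention(rng.normal(size=(n, d)),
                       rng.normal(size=(n, d)),
                       rng.normal(size=(n, d)))
print(out.shape)  # → (8, 4)
```

Because the (d × d_v) summary `kv` is independent of sequence length, the same trick supports constant-memory recurrent decoding, which is what makes it attractive on personal devices.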
GPT Image 2 integrated into Manus Slides
April 29 — Manus has integrated GPT Image 2 into Manus Slides: point-and-click visual editing, prompt-based replacement, presentation note generation, export to Google Slides, PowerPoint, PDF, Google Drive, and OneDrive.
Salesforce connected to Genspark
April 29 — Genspark has integrated Salesforce into its agent ecosystem: connection via Genspark Claw (CLI installation by instruction) or Super Agent (direct connection). Use cases include automatic customer request handling, quarterly dashboards, and automated sales pipeline management.
GPT-5.5 and ChatGPT Images 2.0 on Genspark
April 28 — Genspark has integrated GPT-5.5 into its AI chat and ChatGPT Images 2.0 (GPT Image 2) into its image generator, available respectively at genspark.ai/agents and genspark.ai/ai_image.
Pika Agents — creative conversational interface
April 28 — Pika has launched Pika Agents: a video creation interface that replaces the prompt box with a personalized AI agent (voice, face, personality configured by the user). The agent understands creative intent in natural language and assembles, refines, and produces in a single conversation.
Codex seats at $0 for ChatGPT Business through the end of June
April 29 — OpenAI is allowing eligible ChatGPT Business subscribers to add Codex seats with no seat cost through the end of June 2026, accompanying Codex’s expansion on AWS.
60-year-old Erdős problem solved with GPT-5.5
April 28 — OpenAI published a podcast episode in which Sébastien Bubeck and Ernest Ryu revisit the solution to a mathematical problem that had remained open for 60 years, attributed to Paul Erdős, with the help of GPT-5.5. The tweet exceeded 399,000 views.
Briefs
- DeepSeek-V4-Pro: -75% promo extended — The 75% discount on the DeepSeek-V4-Pro API has been extended through May 31, 2026. Promotional pricing: $0.003625/M input tokens (cache hit), $0.435 (cache miss), $0.87 output. 🔗 DeepSeek tweet
- Google DeepMind — Experience AI in Latin America — The educational program Experience AI (Raspberry Pi Foundation) is expanding in Latin America with a goal of training 24,000 teachers and reaching 1.25 million students by 2028, funded with $4.6 million from Google.org. 🔗 Google DeepMind tweet
- GPT-5.3-Codex removed from Copilot Student model picker — Effective April 27, 2026, GPT-5.3-Codex can no longer be selected manually in the Copilot Student plan; it remains accessible through automatic selection. 🔗 GitHub changelog
- Responses API — blocked domains for web search — OpenAI’s Responses API now makes it possible to block specific domains while keeping web search enabled, in order to exclude specific sources from results. 🔗 @charlierguo tweet
- OpenAI — community safety commitment — OpenAI published an article detailing its safety practices in ChatGPT: model-level risk mitigation, automated monitoring, connecting users with support resources, and reporting to authorities in serious cases. A transparency publication with no new feature. 🔗 OpenAI announcement
What it means
The race for open models is intensifying. Mistral Medium 3.5 (128B, SWE-Bench 77.6%) and NVIDIA Nemotron 3 Nano Omni (30B, 9× more efficient than other open omnimodal models) arrive simultaneously with permissive licenses. Both position themselves as credible alternatives to closed frontier models: Mistral on code and reasoning, Nemotron on agentic multimodality. This pressure keeps narrowing the gap between proprietary models and open weights.
Hardware infrastructure remains the strategic bottleneck. Google’s 8th-generation TPUs (3× in training, 5× lower inference latency) illustrate that the AI race is also being fought at the silicon level. The Google Cloud Next ‘26 announcement positions Google infrastructure as a durable competitive advantage against NVIDIA GPUs — even if both coexist in real deployments.
The agentic ecosystem is fragmenting into vertical specializations. This week, AI agents are embedding themselves in creative tools (Claude for Creative Work with 8+ MCP connectors), software development (Vibe Remote Agents, Copilot cloud agent 20% faster), music (ElevenMusic), video (Pika Agents), CRMs (Salesforce in Genspark), and enterprise workflows (Mistral Workflows). The question is no longer “can AI do this?” but “in which specialized tool and under what billing model?”
Usage-based billing is transforming developers’ business models. GitHub Copilot code review’s shift to double charging (AI credits + Actions minutes) starting June 1, combined with the Codex seats offer at $0 for ChatGPT Business, illustrates a broader dynamic: publishers subsidize adoption (temporary free access, DeepSeek’s -75% promo) to create habits before normalizing usage-based billing. Technical teams would benefit from auditing their AI spending categories before June.
Sources
- Mistral Medium 3.5 + Vibe Remote Agents
- @mistralvibe announcement on X
- Mistral Workflows
- Google 8th-generation TPU — @GoogleAI on X
- Claude for Creative Work — Anthropic
- NVIDIA Nemotron 3 Nano Omni — NVIDIA Blog
- ElevenMusic — ElevenLabs Blog
- GitHub Copilot code review → Actions minutes
- Copilot cloud agent 20% faster
- OpenAI DevDay 2026 — @OpenAI on X
- OpenAI × AWS
- Codex seats $0 — @OpenAIDevs on X
- Erdős problem — @OpenAI on X
- Gemini file generation — Google Blog
- Qwen FlashQLA — GitHub
- GPT Image 2 in Manus Slides
- Salesforce in Genspark
- GPT-5.5 and ChatGPT Images 2.0 on Genspark
- Pika Agents
- Claude Code CLI v2.1.123 — @ClaudeDevs on X
- DeepSeek-V4-Pro promo extended
- Google DeepMind Experience AI Latin America
- GPT-5.3-Codex removed from Copilot Student
- Responses API blocked domains
- OpenAI community safety commitment
This document was translated from the French version into English using the gpt-5.4 model. For more information about the translation process, see https://gitlab.com/jls42/ai-powered-markdown-translator