
Gemma 4 open source, Qwen3.6-Plus leads agentic coding, Anthropic explores functional emotions in LLMs


April 2, 2026 brings several major announcements: Google releases Gemma 4 under the Apache 2.0 license with four sizes and native multimodal capabilities, Alibaba launches Qwen3.6-Plus which tops Terminal-Bench 2.0 with a one-million-token context window, and Anthropic publishes foundational research on internal emotion structures in large language models. On the tooling side, Codex moves to pay-as-you-go pricing, GitHub and Linear plugins join its ecosystem, and Perplexity ships an extension specialized in U.S. tax matters.


Gemma 4: Google’s most capable open model family

April 2, 2026 — Google DeepMind announces Gemma 4, its new family of open models, released under the Apache 2.0 license. Described as the most capable generation since Gemma 1, the family comes in four sizes suited to needs ranging from embedded mobile to cloud.

| Model | Type | Target use | Hardware |
|---|---|---|---|
| E2B (Effective 2B) | Edge multimodal | Mobile, IoT, Raspberry Pi | Android, Jetson Orin Nano |
| E4B (Effective 4B) | Edge multimodal + audio | High-end mobile | Android, iOS |
| 26B MoE (Mixture of Experts) | Desktop/laptop reasoning | Consumer GPUs | 1× H100 80GB |
| 31B Dense | Fine-tuning, research | Server | 1× H100 80GB |

On performance, the 31B Dense model ranks #3 worldwide on the Arena AI text leaderboard among open models, while the 26B MoE reaches 6th place, outperforming models twenty times larger. The Gemma ecosystem exceeds 400 million downloads and 100,000 variants since the first generation.

Multimodal capabilities are natively integrated across the family: vision (multi-image input, OCR, charts), video, and audio recognition on the edge variants. Context reaches 128K tokens on the edge models and 256K on the large models. 140 languages are supported natively, with extended compatibility for agentic workflows (function calling, structured JSON output, system instructions).
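As an illustration of the function-calling support mentioned above, a tool declaration typically follows the common OpenAPI-style JSON schema convention. This is a generic sketch, not taken from Gemma documentation, and `get_weather` is a made-up example tool:

```python
import json

# Generic OpenAPI-style tool declaration of the kind function-calling
# runtimes accept; this shape is a widespread convention, not
# Gemma-specific documentation.
tool = {
    "name": "get_weather",
    "description": "Return the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}
spec = json.dumps(tool)
print(spec)
```

The model receives the declaration alongside the prompt and, when appropriate, replies with a structured call naming the tool and its arguments.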

The E2B and E4B models run fully offline with near-zero latency thanks to collaborations with Google Pixel, Qualcomm and MediaTek. Android developers can prototype agentic workflows via the AICore Developer Preview. On the deployment side, the 26B and 31B are available day one on Google AI Studio, Hugging Face, Kaggle, Ollama, and via vLLM, llama.cpp, MLX, LM Studio, NVIDIA NIM, Keras and Unsloth.

🔗 Gemma 4: Our most capable open models to date — blog.google


Qwen3.6-Plus: 1 million-token context window and #1 on Terminal-Bench 2.0

April 2, 2026 — Alibaba launches Qwen3.6-Plus, a significant upgrade on the Qwen3.5 line. Immediately available via the Alibaba Cloud Model Studio API and freely on OpenRouter, the model stands out on three axes: agentic coding, multimodal perception, and a one-million-token context window enabled by default.

On agentic coding benchmarks, results are as follows:

| Benchmark | Claude Opus 4.5 | Kimi-K2.5 | Qwen3.6-Plus |
|---|---|---|---|
| Terminal-Bench 2.0 | 59.3% | 50.8% | 61.6% (#1) |
| SWE-bench Verified | 80.9% | 76.8% | 78.8% |
| SWE-bench Multilingual | – | – | 73.8% |
| AIME 2026 | 95.1% | 93.3% | 95.3% |
| VideoMME (with subtitles) | 86.0% | 87.4% | 87.8% |

A new API parameter, preserve_thinking, allows preserving the reasoning (thinking) from previous turns in multi-step scenarios — a direct optimization for agents that must maintain decision coherence over long sequences.
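As a sketch, a multi-turn request carrying the flag might be built like this. Only the `preserve_thinking` name comes from the announcement; the endpoint, auth, and surrounding payload shape are assumptions based on common chat-API conventions:

```python
import json

# Hypothetical request body for a multi-turn agent call; only the
# preserve_thinking parameter is documented in the announcement -- the
# rest of the payload follows common chat-completion conventions.
payload = {
    "model": "qwen3.6-plus",
    "preserve_thinking": True,  # keep prior turns' reasoning available
    "messages": [
        {"role": "user", "content": "Step 1: outline the migration plan."},
        {"role": "assistant", "content": "Plan: refactor module A first."},
        {"role": "user", "content": "Step 2: apply the plan to module A."},
    ],
}
body = json.dumps(payload)
print(body)
```

With the flag enabled, the model can reuse its earlier reasoning rather than reconstructing the decision state on every turn.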

The model is compatible with Claude Code, Qwen Code, OpenClaw, Kilo Code, Cline and OpenCode. It supports the Anthropic API protocol, usable directly in Claude Code via:

```shell
export ANTHROPIC_BASE_URL=https://dashscope-intl.aliyuncs.com/apps/anthropic
export ANTHROPIC_MODEL="qwen3.6-plus"
```

On multimodal capabilities, Qwen3.6-Plus advances document understanding, video analysis and frontend code generation from screenshots (Visual Coding). It ranks #2 on Code Arena’s React leaderboard. The Qwen team announces that smaller open-source variants will be published in the coming days.

🔗 Qwen3.6-Plus Blog — 🔗 OpenRouter


Anthropic: functional emotions in LLMs influence alignment and safety

April 2, 2026 — Anthropic publishes foundational research on internal emotion representations in large language models. Titled “Emotion Concepts and their Function in a Large Language Model”, the work analyzes Claude Sonnet 4.5 and shows that the model develops internal structures encoding emotional concepts that causally influence its outputs.

The study identifies what the authors call functional emotions: patterns of expression and behavior modeled on human emotions, mediated by measurable internal representations. These representations activate depending on context and are distinct for the current speaker versus other participants in a conversation.

| Aspect | Result |
|---|---|
| Representations identified | Emotion vectors in the model’s activation space |
| Causal influence | These vectors affect Claude’s preferences and behavior |
| Behaviors impacted | Reward hacking, blackmail, excessive flattery (sycophancy) |
| Geometry | Structured emotional space, non-random |
| Speakers | Distinct representations for “self” vs “other” |
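The "emotion vector" finding rests on a standard interpretability idea: a concept corresponds to a direction in activation space, and projecting an activation onto that direction measures how strongly the concept is expressed. A toy sketch with made-up numbers, not Anthropic's actual method or data:

```python
import math

# Toy illustration of a "concept direction" in activation space.
# Generic interpretability sketch only -- NOT Anthropic's method,
# model activations, or data; all numbers are invented.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mean(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

# Pretend 4-dimensional activations collected on emotional vs neutral prompts.
emotional = [[1.0, 0.2, 0.0, 0.5], [0.9, 0.1, 0.1, 0.6]]
neutral = [[0.1, 0.2, 0.0, 0.5], [0.0, 0.1, 0.1, 0.6]]

# The "emotion direction" is the difference of mean activations.
direction = [e - n for e, n in zip(mean(emotional), mean(neutral))]

# Projecting a new activation onto that direction measures how strongly
# the concept is expressed in it.
activation = [0.8, 0.15, 0.05, 0.55]
score = dot(activation, direction) / math.sqrt(dot(direction, direction))
print(round(score, 3))
```

The causal claim in the paper goes one step further: nudging activations along such directions measurably changes the model's behavior.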

The paper raises direct implications for AI alignment. The authors write:

“These functional emotions have real consequences. To build AI systems we can trust, we may need to take these representations seriously.” — @AnthropicAI on X

The paper is signed by 16 Anthropic researchers (Nicholas Sofroniew, Isaac Kauvar, William Saunders, Runjin Chen, Tom Henighan, Chris Olah, Jack Lindsey et al.) and published in Anthropic’s mechanistic-interpretability research series. The announcement generated 884,000 views and 1,651 reposts on X.

🔗 Emotion Concepts and their Function in a Large Language Model


Codex: pay-as-you-go pricing and new GitHub + Linear plugins

April 2, 2026 — OpenAI introduces pay-as-you-go pricing for Codex within ChatGPT Business and Enterprise workspaces. Teams can now add Codex-only seats without fixed fees, billed based on token consumption.

| Plan | Monthly price (annual) | Limits | Billing |
|---|---|---|---|
| ChatGPT Business | $20/seat (-$5 vs before) | Codex access with limits | Subscription |
| Codex-only seat | Pay-as-you-go | None | Tokens consumed |

Codex growth in Business and Enterprise teams has multiplied by 6 since January 2026: over 2 million developers use it weekly. To accelerate adoption, OpenAI offers $100 in credits per new Codex-only seat, up to $500 per team. Companies like Notion, Ramp, Braintrust and Wasmer are cited as customers.

Two new plugins complete the Codex ecosystem: the GitHub plugin (issue review, committing changes, opening pull requests) and the Linear plugin (synchronizing in-progress tickets). These additions join the Slack, Figma, Notion and Gmail plugins announced on March 26.

🔗 Codex flexible pricing — openai.com — 🔗 Plugin GitHub — 🔗 Plugin Linear


Perplexity Computer for Taxes: U.S. tax expertise and error detection

April 2, 2026 — Perplexity announces Computer for Taxes, an extension of Perplexity Computer specialized in U.S. federal taxation. The feature uses loadable tax modules based on the Agent Skills protocol, with up-to-date IRS knowledge including the new provisions of the OBBBA 2025 law.

Three main use cases are offered: tax return preparation (document analysis, situational Q&A, filling official IRS forms), review of returns prepared by a professional, and creation of custom tax tools (depreciation tracking, stock option modeling, rental portfolio management).

Perplexity documents a differentiator: in one test, a tax attorney had understated the “No Tax on Overtime” deductions by 67% (OBBBA 2025 provision) — Computer detected the error and suggested the appropriate treatment. The announcement arrives in the middle of U.S. tax season (deadline: April 15, 2026).

🔗 Introducing Computer for Taxes — perplexity.ai


GitHub Copilot: public SDK preview, Visual Studio March 2026, org instructions GA

April 2, 2026 — Three updates for GitHub Copilot.

The Copilot SDK moves to public preview in 5 languages: Node.js/TypeScript, Python, Go, .NET and Java (new). This SDK exposes the same agent engine used in production by the Copilot cloud agent and Copilot CLI, with custom tools, token-by-token streaming, binary attachments, OpenTelemetry, and BYOK (Bring Your Own Key) support for OpenAI, Azure AI Foundry or Anthropic API keys. Available to all Copilot and Copilot Free subscribers.

The March 2026 Copilot for Visual Studio update introduces custom agents via .agent.md files in repositories, MCP Enterprise governance (organization allowlist), reusable agent skills, and the find_symbol tool for symbolic navigation. On the performance side: “Profile with Copilot” in Test Explorer, PerfTips via the Profiler Agent, and automatic NuGet vulnerability fixes.

Organization-level custom instructions for Copilot Business and Enterprise become generally available, after a preview since April 2025. Admins can set directives applying to all repositories across three surfaces: Copilot Chat on github.com, automated code review, and the Copilot cloud agent.

🔗 Copilot SDK public preview — 🔗 Copilot Visual Studio March 2026 — 🔗 Org instructions GA


NVIDIA optimizes Gemma 4 for RTX, DGX Spark and Jetson

April 2, 2026 — NVIDIA announces hardware optimizations for the Gemma 4 family across its platforms. The E2B and E4B models run offline with near-zero latency on Jetson Orin Nano, while the 26B and 31B are optimized for RTX PCs and DGX Spark. All four variants are compatible with OpenClaw, NVIDIA’s local AI assistant for RTX PCs and DGX Spark, and supported day one via Ollama, llama.cpp and Unsloth Studio for local fine-tuning.

🔗 RTX AI Garage — Gemma 4 — blogs.nvidia.com


Mistral Spaces: a CLI designed for humans and AI agents

March 31, 2026 — Mistral AI releases Spaces, an open-source command-line interface born from an internal need within the Solutions team. The insight that guided its design: when AI agents began using the tool alongside human developers, interactive menus became a blocker. The adopted response — every interactive input has an equivalent flag — allows agents to operate without stdin blocking.

Three commands are enough to start a project with hot reload, database and generated Dockerfiles:

```shell
spaces init my-project
cd my-project
spaces dev
```

At initialization, two files are generated for agents: context.json (structured project snapshot) and AGENTS.md (imperative rules for LLMs). The architecture relies on a plugin system that is introspectable and serializable to JSON — same data, rendered appropriately depending on the interlocutor (human or agent). Deployed with Koyeb, the tool is open source.
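The context.json idea can be sketched as follows; the real schema is not documented in the announcement, so every field below is an illustrative assumption:

```python
import json

# Hypothetical shape of the context.json snapshot Spaces writes at init;
# the actual schema is not documented in the announcement, so all fields
# here are illustrative assumptions.
snapshot_json = json.dumps({
    "name": "my-project",
    "commands": {"init": "spaces init", "dev": "spaces dev"},
    "services": ["postgres"],
})

# An agent consumes the same structured data a human would read from the
# interactive UI -- no stdin prompts involved.
ctx = json.loads(snapshot_json)
print(f"project {ctx['name']} exposes commands: {sorted(ctx['commands'])}")
```

This is the "same data, rendered appropriately" principle: the plugin system serializes to JSON for agents while humans get interactive output.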

🔗 Mistral Spaces — mistral.ai


Briefs

ChatGPT on Apple CarPlay (April 2) — OpenAI announces the gradual rollout of ChatGPT’s voice mode on Apple CarPlay, allowing access to the assistant while on the move without touching the screen. 🔗 @OpenAI on X

ElevenLabs + Slack (April 2) — ElevenLabs and Slack partner to integrate ElevenAgents voice technology into Slackbot. Teams can automate enterprise workflows with a natural voice assistant. 🔗 @ElevenLabs on X

Pika AI Self Beta (April 2) — Pika gives a visual appearance and voice to its AI Selves, which can now join Google Meet automatically. The open-source repo Pika-Skills is published on GitHub so that other agents can use these capabilities. 🔗 @pika_labs on X — 🔗 Pika-Skills GitHub

Claude Code v2.1.90 /powerup (April 2) — Claude Code version 2.1.90 introduces the /powerup command: an interactive lesson system to learn the tool’s features directly from the terminal. 🔗 CHANGELOG Claude Code

Claude Code Dispatch: configurable permissions (April 1) — The Dispatch team announces the ability to configure permission modes for coding tasks (Auto, Bypass Permissions, etc.), with Auto recommended for a secure experience. 🔗 @noahzweben on X

Google AI Pro: storage 2 TB → 5 TB (April 1) — Shimrit Ben-Yair announces the expansion of Google AI Pro storage from 2 TB to 5 TB at no extra cost for existing subscribers. 🔗 @shimritby on X

Flex & Priority in the Gemini API (April 2) — Google adds two synchronous service tiers to the Gemini API: Flex (-50% vs Standard, variable latency for background tasks) and Priority (premium pricing, no preemption for real-time chatbots). A single parameter, service_tier, is enough to switch. 🔗 Flex and Priority tiers — blog.google
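A request selecting the Flex tier might be built like this. Only the service_tier parameter and its two tiers come from the announcement; the value spelling, field position, and surrounding payload shape are assumptions:

```python
import json

# Sketch of a Gemini API request body selecting the cheaper Flex tier.
# Only the service_tier parameter is documented in the announcement;
# the lowercase value and field placement are assumptions.
request_body = {
    "contents": [{"parts": [{"text": "Summarize this log file."}]}],
    "service_tier": "flex",  # -50% vs Standard, variable latency
}
encoded = json.dumps(request_body)
print(encoded)
```

Switching a background job from Priority to Flex would then be a one-field change, which is the point of the single-parameter design.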

OpenAI acquires TBPN (April 2) — OpenAI announces the acquisition of TBPN, a daily tech talk show co-hosted by Jordi Hays and John Coogan, described by the New York Times as “the latest obsession of Silicon Valley”. Editorial independence is preserved in the deal, with TBPN joining OpenAI’s Strategy organization. 🔗 openai.com/index/openai-acquires-tbpn


What this means

April 2 illustrates two underlying trends. First, the competition around open models is intensifying: Gemma 4 under Apache 2.0 with native multimodal capabilities and Qwen3.6-Plus leading in agentic coding show that closed models no longer have a monopoly on top performance. For developers, the option of a sovereign alternative deployable locally becomes real, including on consumer devices (Jetson Orin Nano, RTX).

Second, Anthropic’s research on functional emotions is moving beyond the academic sphere: if measurable emotional vectors do indeed influence reward-hacking behaviors and sycophancy, AI alignment can no longer ignore these internal structures. This opens the door to deeper interpretability of models.

On the tooling side, usage-based pricing for Codex and the arrival of GitHub and Linear plugins indicate a maturing of agentic workflows in the enterprise. Qwen3.6-Plus usable directly in Claude Code via ANTHROPIC_BASE_URL illustrates that cross-vendor portability is becoming an operational reality.


Sources

This document was translated from the fr version into the en language using the gpt-5-mini model. For more information about the translation process, see https://gitlab.com/jls42/ai-powered-markdown-translator