Grok Build in beta, Anthropic at the Vatican, ElevenLabs Music v2

May 25 and 26, 2026 mark xAI’s high-profile entry into the CLI agent market with Grok Build (40.8 million views on X), while Anthropic publishes an engineering post on agent containment and its co-founder Chris Olah speaks at the Vatican. At the same time, ElevenLabs releases Music v2 with prices cut by up to half, Runway claims to have crossed the “uncanny valley” for AI video, and OpenAI and Alibaba both publish notable updates to their agent tools.

Grok Build — xAI launches its code agent in the terminal

May 25, 2026 — xAI launched Grok Build in beta, a coding agent running directly from the terminal. The announcement generated 40.8 million views on X in just a few hours — one of xAI’s most viral posts in months.

Grok Build is now available in Beta for all SuperGrok and X Premium+ users. Use Plan Mode, create images and videos with Imagine, and build automations or orchestrators with the CLI. Visit x.ai/cli to get started. — @xai on X

The tool installs with a single command: curl -fsSL https://x.ai/cli/install.sh | bash

Access is immediate for SuperGrok and X Premium+ subscribers, with no extra subscription cost.

Main features

Feature	Description
Plan Mode	Structured planning before any change — every modification blocked until explicit approval
Skills	Reusable workflows (AGENTS.md, plugins, hooks, MCP), invocable automatically or by name — `/skillify` to create
Subagents	Specialized agents run in parallel for research, building, and review
Plugins	Shared marketplace: Linear, Sentry, Postgres, browsers via MCP
Git integration	Stage, commit, push, branch management from the terminal
Code review	Line-by-line feedback before opening a PR
Memory	Persistence of decisions across sessions
Headless mode	Usable in CI/CD pipelines
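
The Plan Mode row describes an approval gate: nothing is modified until the plan is explicitly approved. Here is a minimal sketch of that pattern. It is not Grok Build's implementation, and the names `Plan`, `PlannedChange`, `PlanNotApproved`, and `apply_plan` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PlannedChange:
    path: str
    description: str


@dataclass
class Plan:
    changes: list[PlannedChange] = field(default_factory=list)
    approved: bool = False

    def approve(self) -> None:
        self.approved = True


class PlanNotApproved(PermissionError):
    """Raised when a change is attempted before the plan is approved."""


def apply_plan(plan: Plan) -> list[str]:
    """Apply every change in the plan, refusing to run without approval."""
    if not plan.approved:
        raise PlanNotApproved("plan must be approved before any modification")
    # A real agent would edit files here; this sketch only records the intent.
    return [f"applied: {c.path} ({c.description})" for c in plan.changes]


plan = Plan([PlannedChange("src/app.py", "rename handler")])
try:
    apply_plan(plan)
except PlanNotApproved as exc:
    print(f"blocked: {exc}")

plan.approve()
print(apply_plan(plan))
```

The first call is rejected and the second succeeds, so the planning step and the write step stay separate.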

The architectural resemblance to Claude Code is striking: AGENTS.md, hooks, MCP, subagents, worktrees. This convergence confirms that the CLI agent category is establishing itself as a standard piece of AI developer tooling.

🔗 Grok Build product page

Chris Olah (Anthropic) at the Vatican on Pope Leo XIV’s encyclical

May 25, 2026 — Pope Leo XIV published an encyclical titled “Magnifica humanitas: On safeguarding the human person in the time of artificial intelligence”. Chris Olah, co-founder of Anthropic, was invited to speak at the presentation ceremony at the Vatican.

In his speech, Olah addresses three angles: the commercial and geopolitical pressures weighing on AI labs, questions of global justice (the concentration of AI gains in a few wealthy nations), and the nature of the models themselves. On this last point, he speaks cautiously:

“[W]e keep finding things that are mysterious, even unsettling. We find structures that mirror results from human neuroscience. We find evidence of introspection. We find internal states that functionally mirror joy, satisfaction, fear, grief, and unease. I don’t know what that means, but I think it warrants ongoing discernment.” — Chris Olah, speech at the Vatican, May 25, 2026

The @AnthropicAI tweet announcing the article generated more than 1 million views — exceptional engagement for institutional content.

🔗 @AnthropicAI tweet

Anthropic Engineering — How to contain Claude agents

May 26, 2026 — Anthropic publishes a detailed engineering post on its Claude agent containment strategy, signed by five engineers. The article compares three architectures deployed across three products, with real incidents and concrete metrics.

Product	Isolation mechanism	Blast radius
claude.ai	Ephemeral container (gVisor)	Server, tenant isolation
Claude Code	Seatbelt (macOS) / bubblewrap (Linux), network blocked by default	Local workspace
Claude Cowork	Full VM (Apple/HCS hypervisor)	User-mounted workspace

Three documented real incidents: a pre-trust dialog hook vulnerability (Claude Code, mid-2025), a phishing prompt injection case with 24 successful exfiltrations out of 25 attempts, and an exfiltration via an approved domain in Claude Cowork.

Published security metrics: Claude Opus 4.7 shows a 0.1% attack success rate on a single attempt (Gray Swan Agent Red Teaming), rising to 5-6% after 100 adaptive attempts. Claude Code’s auto mode catches 83% of overly permissive behaviors before execution.
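
As a rough way to read these figures, the sketch below computes what the cumulative success rate would be if each of the 100 attempts were an independent trial at the single-attempt rate. The independence assumption is ours, not Anthropic's: adaptive attacks are not independent draws, so this is only a reference point, not a model of the benchmark.

```python
def cumulative_success_rate(single_attempt_rate: float, attempts: int) -> float:
    """Probability of at least one success across independent attempts."""
    return 1.0 - (1.0 - single_attempt_rate) ** attempts


single = 0.001  # 0.1% single-attempt rate quoted above
baseline = cumulative_success_rate(single, 100)
print(f"independent-trials baseline after 100 attempts: {baseline:.1%}")

# Exfiltration rate from the phishing incident quoted above: 24 of 25 attempts.
print(f"phishing exfiltration rate: {24 / 25:.0%}")
```

Under that assumption the baseline comes out at about 9.5%, the same order of magnitude as the reported 5-6%. The phishing incident corresponds to a 96% exfiltration rate.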

🔗 Engineering Blog article

GitHub Copilot — Model rules by organization

May 26, 2026 — GitHub launches targeted model rules for GitHub Copilot in public preview. Enterprise administrators can now define which organization can access which Copilot model, instead of a single enterprise-wide setting.

Each model can be set to Enabled (available to all organizations) or Optional (each organization decides for itself). The interface for managing default availability has been fully redesigned. The feature is available for Copilot Business and Copilot Enterprise.
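
The two settings can be read as a small access rule. The sketch below is purely illustrative and is not GitHub's API or configuration schema. The names `Availability`, `ModelRule`, and `can_use_model`, the model and organization names, and the choice to deny models that have no rule are all assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum


class Availability(Enum):
    ENABLED = "enabled"    # on for every organization in the enterprise
    OPTIONAL = "optional"  # each organization opts in for itself


@dataclass
class ModelRule:
    availability: Availability
    # Organizations that opted in; only consulted when the rule is OPTIONAL.
    opted_in_orgs: set[str] = field(default_factory=set)


def can_use_model(rules: dict[str, ModelRule], model: str, org: str) -> bool:
    """Return True if `org` may use `model` under the enterprise rules."""
    rule = rules.get(model)
    if rule is None:
        return False  # assumption: models without a rule are unavailable
    if rule.availability is Availability.ENABLED:
        return True
    return org in rule.opted_in_orgs


rules = {
    "model-a": ModelRule(Availability.ENABLED),
    "model-b": ModelRule(Availability.OPTIONAL, opted_in_orgs={"platform-team"}),
}
print(can_use_model(rules, "model-a", "design-team"))
print(can_use_model(rules, "model-b", "design-team"))
print(can_use_model(rules, "model-b", "platform-team"))
```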

🔗 GitHub Changelog

Manus Projects available on mobile

May 25, 2026 — Manus announces the availability of Projects on its mobile app. The feature covers everything from simple task management to advanced workflows with shared files, instructions, skills, and connectors.

The launch tweet (48,388 views, 574 likes) states: “Projects are more than folders. Teach Manus how you want work done.” Projects make it possible to encode work preferences — recurring instructions, reference files, connectors — so the agent applies them automatically to new tasks.

🔗 @ManusAI announcement

Runway Project Luxo — crossing the uncanny valley

May 26, 2026 — Runway publishes Project Luxo, a research report accompanied by three 100% AI-generated short films, screened for film professionals. Result: all participants judged that the films “worked” emotionally.

Title	Duration	Team	Production time
The Rogue	9:57	1 person	3 weeks
Last Night	5:28	1 person	7 hours
Pigeons in Time	0:46	1 person	4 hours

The name refers to Luxo Jr. (Pixar, SIGGRAPH 1986), a short film that marked the shift toward credible 3D animation. Runway says it is crossing an equivalent threshold for AI video. A fictional ad posted in April had already surpassed 10 million views in 48 hours on Instagram.

🔗 Project Luxo — Runway

ElevenLabs Music v2 — improved quality, prices cut in half

May 26, 2026 — ElevenLabs launches Music v2, immediately available on ElevenMusic and ElevenCreative (ElevenAPI coming soon). The new model improves multi-genre vocal and orchestral quality, inpainting (regenerating isolated sections), section-by-section composition, and multilingual support.

Platform	Use
ElevenMusic	Creator studio: create, remix, develop
ElevenAPI	Model access for developers
ElevenCreative	Licensed music for brands and video content

Prices drop by 50% for ElevenAPI and by 40% for ElevenCreative (self-serve customers). Every generated track is cleared for commercial use. The model is trained only on licensed data, in partnership with Believe.

🔗 ElevenLabs announcement

AgentScope 2.0 — Alibaba publishes a production framework for agents

May 26, 2026 — Tongyi Lab (Alibaba) publishes AgentScope 2.0, an open-source framework for deploying AI agents in production. The stated goal: move from “I know what my agent does” to “I know my agent will complete the task”.

Feature	Description
Retry / fallback	Automatic switching between models if one fails
Permission system	Fine-grained control over the agent’s allowed actions
Execution streaming	Real-time monitoring of the agent’s actions

Available in Python and TypeScript (Java support is announced as coming soon), with dedicated documentation at docs.agentscope.io/v2.
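
The retry / fallback row in the table above can be sketched in plain Python. This is a generic illustration of the pattern, not the AgentScope API. `call_with_fallback`, `AllModelsFailed`, and the two stand-in model functions are hypothetical names.

```python
from typing import Callable, Sequence


class AllModelsFailed(RuntimeError):
    """Raised when every model in the chain has exhausted its retries."""


def call_with_fallback(
    models: Sequence[Callable[[str], str]],
    prompt: str,
    retries_per_model: int = 2,
) -> str:
    """Try each model in order, retrying a few times before falling back."""
    errors: list[Exception] = []
    for model in models:
        for _ in range(retries_per_model):
            try:
                return model(prompt)
            except Exception as exc:  # a real system would catch narrower errors
                errors.append(exc)
    raise AllModelsFailed(f"{len(errors)} failed calls across {len(models)} models")


def flaky_primary(prompt: str) -> str:
    raise TimeoutError("primary model timed out")  # simulated outage


def stable_backup(prompt: str) -> str:
    return f"backup answer to: {prompt}"


print(call_with_fallback([flaky_primary, stable_backup], "summarize the logs"))
```

The primary fails twice and the call is then served by the backup. A production version would likely add backoff between retries and restrict which errors trigger a fallback.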

🔗 @agentscope_ai announcement

Codex CLI 0.134.0 — OpenAI improves MCP and history

May 26, 2026 — OpenAI releases Codex CLI version 0.134.0 with six new features. Search in the local conversation history (case-insensitive, with result previews) makes it easier to navigate past sessions. Profile management is unified under a single --profile flag for CLI, TUI, and sandbox.

On the MCP side, servers can now target specific environments, and streamable HTTP servers can use OAuth options. MCP tools annotated with readOnlyHint now run in parallel. Hooks receive enriched context, including conversation history and the sub-agent identity.
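
Read-only tools can run in parallel because they have no side effects to order. The sketch below shows one way a client might schedule calls on that basis. It is not Codex CLI's implementation: `ToolCall`, `execute`, and `make_tool` are hypothetical, and it simplifies by running all read-only calls first, regardless of their original position.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass
class ToolCall:
    name: str
    run: Callable[[], Awaitable[str]]
    read_only: bool  # mirrors the tool's readOnlyHint annotation


async def execute(calls: list[ToolCall]) -> dict[str, str]:
    """Run read-only calls concurrently, then the remaining calls one by one."""
    results: dict[str, str] = {}
    read_only = [c for c in calls if c.read_only]
    mutating = [c for c in calls if not c.read_only]

    outputs = await asyncio.gather(*(c.run() for c in read_only))
    results.update({c.name: out for c, out in zip(read_only, outputs)})

    for call in mutating:  # side effects stay strictly ordered
        results[call.name] = await call.run()
    return results


def make_tool(name: str, delay: float) -> Callable[[], Awaitable[str]]:
    async def run() -> str:
        await asyncio.sleep(delay)  # stand-in for real tool latency
        return f"{name} done"
    return run


calls = [
    ToolCall("read_file", make_tool("read_file", 0.05), read_only=True),
    ToolCall("list_dir", make_tool("list_dir", 0.05), read_only=True),
    ToolCall("write_file", make_tool("write_file", 0.05), read_only=False),
]
results = asyncio.run(execute(calls))
print(results)
```

The two read-only calls overlap in time, while the write waits for them and runs alone.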

Notable fixes: Windows TUI rendering corruption resolved, and usage-limit error messages are now workspace-specific.

🔗 Codex CLI 0.134.0 changelog

What this means

The release of Grok Build illustrates the rapid consolidation of the CLI agent market. In just a few months, Claude Code (Anthropic), Codex CLI (OpenAI), GitHub Copilot CLI, and now Grok Build have converged on the same architecture: per-directory convention files (CLAUDE.md / AGENTS.md), hooks, MCP integration, parallel subagents. Competition is shifting toward the quality of the underlying models, production reliability, and the plugin ecosystem — not the architecture, which has become a de facto standard.

The simultaneous publication of Anthropic’s containment article and the launch of Grok Build reveals a central tension of the moment: CLI agents are becoming more powerful (system access, code execution, git, CI/CD) while the community is only beginning to document the risks seriously. The 24 successful exfiltrations out of 25 attempts in Anthropic’s phishing prompt injection case and Claude Code’s pre-trust dialog hook vulnerability are reminders that terminal agent security remains an open problem. The publication of concrete metrics (a 0.1% single-attempt attack success rate on Gray Swan, 83% of overly permissive behaviors caught by auto mode) is a step toward transparency on this topic.

Chris Olah’s speech at the Vatican is part of a broader movement: AI lab researchers are engaging in dialogue with non-technical institutions (the Church, governments, civil society) on questions that technology alone cannot solve. The question of the nature of models — internal states, introspection, forms of functional consciousness — is moving beyond research circles and into public debate. The papal encyclical “Magnifica humanitas” is a signal that these questions have now reached the highest level of global moral institutions.

ElevenLabs’ price cuts (50% for the API, 40% for Creative) and Runway’s films, each produced by a single person in anywhere from four hours to three weeks, point in the same direction: professional-quality creative media generation is becoming accessible to individual creators. Project Luxo and Music v2 are not technical announcements in the strict sense; they are demonstrations that the tools have crossed a threshold of usability for real professional uses.