MiniMax releases M2.5, an open-source frontier model reaching 80.2% on SWE-Bench Verified. Kling launches its 3.0 model with 1080p video and realistic dialogue. On the research side, Perplexity deploys Model Council to run three models simultaneously, and runs Deep Research on Claude Opus 4.6. Mistral announces its biggest global hackathon with $200K in prizes.
MiniMax M2.5 — open-source frontier model
February 12 — MiniMax announces M2.5, an open-source frontier model designed for real-world productivity. The model achieves state-of-the-art performance in four critical areas: coding, web search, agentic tool calls, and office work.
| Benchmark | Score | Category |
|---|---|---|
| SWE-Bench Verified | 80.2% | Real bug fixing |
| BrowseComp | 76.3% | Web search and navigation |
| BFCL | 76.8% | Agentic tool calls |
| Office Work | Optimized | Document productivity |
The 80.2% score on SWE-Bench Verified places M2.5 among the top-scoring coding models, open or proprietary. On BrowseComp, OpenAI’s web browsing benchmark, it reaches 76.3%, a sign of solid autonomous search capability.
MiniMax claims 37% faster execution on complex tasks compared to competing models, at a cost of $1 per hour when generating 100 tokens per second. The stated goal: to make scaling long-horizon agents economically viable.
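To put the quoted price in more familiar terms, the hourly figure can be converted to a cost per million tokens. The short calculation below is only a back-of-the-envelope sketch: it assumes the model generates continuously at the full 100 tokens/second for the whole hour and that every token is billed alike, which may not match MiniMax's actual pricing structure. The function and variable names are illustrative.

```python
# Figures quoted in the announcement.
COST_PER_HOUR_USD = 1.0
TOKENS_PER_SECOND = 100

SECONDS_PER_HOUR = 3600


def cost_per_million_tokens(cost_per_hour: float, tokens_per_second: float) -> float:
    """Convert an hourly cost at a sustained throughput into USD per 1M tokens.

    Assumption: generation runs at full throughput for the entire hour.
    """
    tokens_per_hour = tokens_per_second * SECONDS_PER_HOUR
    return cost_per_hour / tokens_per_hour * 1_000_000


tokens_per_hour = TOKENS_PER_SECOND * SECONDS_PER_HOUR
usd_per_million = cost_per_million_tokens(COST_PER_HOUR_USD, TOKENS_PER_SECOND)

print(f"{tokens_per_hour:,} tokens per hour")          # 360,000 tokens per hour
print(f"${usd_per_million:.2f} per million tokens")    # $2.78 per million tokens
```

Under these assumptions, an agent running around the clock would cost about $24 per day, which is the kind of figure the "economically viable long-horizon agents" claim is pointing at.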
The model is available via MiniMax Agent (agent.minimax.io) and the developer API (platform.minimax.io). As an open-source frontier model, M2.5 positions itself directly against leading proprietary models.
MiniMax Forge — RL framework for production agents
February 12 — Alongside M2.5, MiniMax releases Forge, a scalable reinforcement learning (RL) framework and algorithm for training production AI agents.
Forge addresses a recurring problem in agent training: the instability of learning at scale. The framework offers an optimized approach for agent reward modeling, targeting ML developers and researchers deploying autonomous agents.
The dual announcement of M2.5 + Forge signals MiniMax’s ambition to offer a complete stack for AI agents: frontier model + training framework.
Kling 3.0 — “Everyone a Director”
February 1 — Kling AI launches its 3.0 model, a major update to its video generation engine positioned around the concept “Everyone a Director”. The model aims to make cinematic creation accessible without technical expertise.
Main improvements focus on visual quality and realism of human interactions:
| Capability | Detail |
|---|---|
| Resolution | Native 1080p |
| Dialogue | Realistic facial expressions and gestures |
| Consistency | Visual style maintained over long sequences |
| Flexibility | From simple prompt to full cinematic storyboard |
Feedback from the creative community is positive, especially on dialogue realism and the ability to produce scenes with convincing human interactions, historically a weak point of AI video models.
Perplexity launches Model Council — multi-model search
February 5 — Perplexity deploys Model Council, a feature that executes the same query on three frontier models simultaneously and produces a single synthesized answer.
Instead of manually switching between models, Model Council runs the query on Claude Opus 4.6, GPT 5.2, and Gemini 3.0 in parallel. A synthesizer model analyzes the results, resolves conflicts between answers, and shows where models converge or diverge.
| Use Case | Detail |
|---|---|
| Investment | Balanced market perspectives |
| Complex Decisions | Business strategy, major purchases |
| Brainstorming | Diversified creative ideas |
| Verification | Validate information with increased confidence |
The feature is available immediately on the web for Perplexity Max subscribers. The mobile version is in development.
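The fan-out-and-synthesize pattern described above can be sketched in a few lines. This is a toy illustration, not Perplexity's implementation: the model names `model_a`, `model_b`, and `model_c`, the `ask_model` stub with its canned answers, and the majority-grouping `synthesize` function are all hypothetical stand-ins. In the real feature the synthesizer is itself a model, whereas here it is plain string grouping.

```python
import asyncio

# Hypothetical model labels; a real system would call three different provider APIs.
MODELS = ["model_a", "model_b", "model_c"]

# Canned replies standing in for real model output (assumption for the demo).
CANNED_ANSWERS = {"model_a": "No", "model_b": "no", "model_c": "Yes"}


async def ask_model(model: str, query: str) -> str:
    """Stub for a network call to one model."""
    await asyncio.sleep(0)  # yield control, as an I/O-bound request would
    return CANNED_ANSWERS[model]


def synthesize(answers: dict[str, str]) -> dict:
    """Toy synthesizer: group identical answers and report who agrees or diverges."""
    groups: dict[str, list[str]] = {}
    for model, answer in answers.items():
        groups.setdefault(answer.strip().lower(), []).append(model)
    consensus, supporters = max(groups.items(), key=lambda item: len(item[1]))
    return {
        "consensus": consensus,
        "agree": supporters,
        "diverge": [m for m in answers if m not in supporters],
    }


async def model_council(query: str) -> dict:
    # Send the same query to every model concurrently, then merge the replies.
    replies = await asyncio.gather(*(ask_model(m, query) for m in MODELS))
    return synthesize(dict(zip(MODELS, replies)))


result = asyncio.run(model_council("Is 91 a prime number?"))
print(result)
```

Running the requests concurrently means total latency is roughly that of the slowest model rather than the sum of all three, which is presumably part of what makes a council approach practical for interactive search.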
Perplexity Deep Research moves to Opus 4.6
February 9 — Perplexity announces that Deep Research now runs on Claude Opus 4.6, which it says improves on its previous state-of-the-art results on internal and external benchmarks. The upgrade strengthens the reasoning behind multi-step research queries.
The feature is available immediately for Max users, with a progressive rollout to Pro users.
Perplexity releases DRACO Benchmark as open-source
February 4 — Perplexity makes DRACO public, an open-source benchmark designed to evaluate deep research tools. The rubrics and full methodology are publicly available.
According to Perplexity, DRACO results confirm that Perplexity Deep Research achieves state-of-the-art performance on external benchmarks, surpassing other deep research tools in accuracy and reliability.
Mistral announces its biggest hackathon — $200K in prizes
February 10 — Mistral AI announces its largest global hackathon to date, scheduled for February 28 to March 1, 2026.
| Detail | Information |
|---|---|
| Format | 48 hours |
| Locations | Paris, London, New York, San Francisco, Tokyo, Singapore, Sydney + online |
| Prizes | $200K in rewards |
| Partners | NVIDIA, AWS, Weights & Biases, Hugging Face |
| Special Prizes | ElevenLabs, Hugging Face |
The event takes place simultaneously in the seven cities listed above and online. The list of partners (NVIDIA, AWS, Weights & Biases, Hugging Face) signals that major players in the AI ecosystem are backing the Mistral platform.
Cohere signs Magnus Carlsen as ambassador
February 13 — Cohere announces a partnership with Magnus Carlsen, five-time World Chess Champion and world No. 1 player, as global brand ambassador.
Carlsen will participate in visibility campaigns, thought leadership initiatives, and high-profile Cohere events. The partnership aims to illustrate the parallels between chess strategy and Cohere’s approach to enterprise AI: focus on fundamentals, anticipation, and sustainable advantages.
In brief
February 12 — Runway launches Story Panels, a new workflow allowing the creation of full films or ads from a single image, with character, location, and style consistency.
February 12-13 — Mooncake, a PyTorch memory allocator co-developed by Moonshot AI (Kimi) and Tsinghua University, joins the PyTorch ecosystem. The tool aims to reduce peak memory usage and fragmentation, which matters for long-context LLM deployment.
February 9 — Ideogram highlights its natural-language image editing, which lets users modify generated images with simple text instructions.
January 30 — Perplexity integrates Kimi K2.5, Moonshot AI’s open-source reasoning model, for its Pro and Max subscribers. Inference runs on Perplexity’s own infrastructure in the US.
February 4 — MiniMax and Hyperbond Studio announce a partnership to develop conversational AI companions with “Call Me Sensei”, using MiniMax LLM and agent APIs.
What this means
The first half of February 2026 confirms several underlying trends. MiniMax M2.5 shows that a less publicized player can release an open-source model rivaling the leaders on coding benchmarks: 80.2% on SWE-Bench Verified is a remarkable score for an open model. With Forge complementing it, MiniMax offers a complete agent stack.
Perplexity accelerates its differentiation with Model Council, a pragmatic approach acknowledging that no single model dominates all use cases. Integrating Opus 4.6 into Deep Research and open-sourcing DRACO reinforce the platform’s transparency and credibility.
Kling 3.0 marks an advance in video generation with realistic dialogue, a step towards accessible cinematic production tools. On the community side, the $200K Mistral hackathon, held in seven cities and online, shows the maturity of the European open-source ecosystem.