MiniMax M2.5 / M2.5-Lightning

Name: MiniMax M2.5 / M2.5-Lightning
Author: MiniMax

An open-weight 229B MoE agentic model that runs frontier coding tasks at roughly one-twentieth the price of closed rivals.

Overview

MiniMax M2.5 is an open-weight large language model from Shanghai-based MiniMax, released on February 12, 2026. It uses a sparse Mixture-of-Experts design with roughly 229B total parameters but only about 10B active per token, which keeps inference cheap while preserving frontier-level quality on coding, tool use, web search, and office-document tasks. The model is text-only; MiniMax handles vision and audio through separate models.

The release shipped in two variants that share the same weights and capability but differ in speed: the standard MiniMax M2.5 and a faster M2.5-Lightning. MiniMax positioned both as built for real-world productivity, trained with reinforcement learning across many complex digital working environments rather than tuned purely for static benchmarks. It posts 80.2% on SWE-bench Verified and 76.3% on BrowseComp, and lands around 42 on the Artificial Analysis Intelligence Index — competitive with much larger closed models.

The headline pitch is price. At launch MiniMax priced M2.5 at roughly one-tenth to one-twentieth the cost of comparable closed frontier models such as Claude Opus 4.6, making sustained agentic use economically viable. The weights are published on Hugging Face under a modified MIT license that requires commercial users to display the 'MiniMax M2.5' name in their interface. It has since been succeeded in MiniMax's lineup by later M-series models (M2.7), and on the official API M2.5 is now listed among legacy models.

Released	2026-02-12
License	Modified MIT (attribution required for commercial use)
Weights	Open weights
Parameters	229B total / 10B active (MoE)
Context	205K
Max output	~131K tokens
Architecture	Sparse Mixture-of-Experts (MoE) transformer with about 229B total parameters and roughly 10B activated per token. Trained with large-scale reinforcement learning across many real-world digital working environments, with a reasoning ("thinking") mode for agentic and coding tasks. Released in two interchangeable variants — a standard model and a faster M2.5-Lightning — that share weights and capability but differ in serving speed.
Knowledge cutoff	Not disclosed
Modalities	Text
Status	Superseded

Benchmarks

SWE-bench Verified80.2%
Multi-SWE-Bench51.3%
BrowseComp76.3%
GPQA Diamond85.2%
AIME 202586.3%
MMLU-Pro80.1%
Artificial Analysis Intelligence Index42%

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input	$0.15 / 1M tokens (M2.5); $0.30 / 1M tokens (M2.5-Lightning) per 1M tokens
Output	$1.20 / 1M tokens (M2.5); $2.40 / 1M tokens (M2.5-Lightning) per 1M tokens

Launch pricing from MiniMax. Both variants support prompt caching. M2.5 is now listed among legacy models on the MiniMax API; third-party providers (OpenRouter, SiliconFlow, Fireworks, Novita) may price it differently.

Pricing source ↗

Strengths

Strong agentic coding: 80.2% on SWE-bench Verified, competitive with closed frontier models
Very low cost per token relative to closed rivals, enabling long-running agent workloads
Open weights under a permissive (modified MIT) license, downloadable and self-hostable
Efficient MoE design — about 10B active parameters keeps serving cheap and fast
Two serving tiers: standard M2.5 and a higher-throughput M2.5-Lightning variant
Long 205K-token context window for large codebases and multi-step agent traces
Built for real-world office and tool-use tasks, not just static benchmarks

Best for

Autonomous and semi-autonomous coding agents running across large codebases
Cost-sensitive, long-horizon agent workflows where token volume is high
Web-research and browsing agents (strong BrowseComp performance)
Self-hosted or private deployment where open weights are a requirement
Office and document-generation tasks (Word, Excel, PowerPoint workflows)
Multilingual programming across Python, Go, C++, Rust, and TypeScript

How to access

Provider	Model ID
MiniMax ↗	`MiniMax-M2.5`
OpenRouter ↗	`minimax/minimax-m2.5`
Hugging Face ↗	`MiniMaxAI/MiniMax-M2.5`

MiniMax M-Series — every version

The full lineage of the MiniMax M-Series line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

Version	Released	Context	License
MiniMax M3current	2026-06-01	1M	MiniMax Community
MiniMax M2.7 / M2.7-highspeed	2026-03-18	—	Open weights
MiniMax M2.5 / M2.5-Lightning	2026-02-12	—	Open weights
MiniMax M2.1	2025-12-23	—	Open weights
MiniMax M2	2025-10-27	—	MIT

FAQ

What is the difference between MiniMax M2.5 and M2.5-Lightning?

They share the same weights and capabilities and differ only in serving speed and price. The standard M2.5 runs around 50 tokens per second at $0.15/$1.20 per million input/output tokens, while M2.5-Lightning roughly doubles throughput to about 100 tokens per second at $0.30/$2.40 per million tokens.

Is MiniMax M2.5 open source?

The weights are open and published on Hugging Face under a modified MIT license. It is open-weight: free to download and self-host, but the license requires that commercial users prominently display the 'MiniMax M2.5' name in their user interface.

How many parameters does MiniMax M2.5 have?

It is a Mixture-of-Experts model with about 229B total parameters but only roughly 10B activated per token, which is what makes its inference unusually cheap for a model of its quality.

Is MiniMax M2.5 still current?

No. M2.5 launched in February 2026 and has since been succeeded by later M-series releases (such as M2.7). On MiniMax's official API it is now listed among legacy models, though it remains downloadable from Hugging Face and available through third-party providers.

// Overview

// Benchmarks

// Pricing

// Strengths

// Best for

// How to access

// MiniMax M-Series — every version

// FAQ