AI/TLDR

MiniMax M2.5 / M2.5-Lightning

An open-weight 229B MoE agentic model that runs frontier coding tasks at roughly one-twentieth the price of closed rivals.

Overview

MiniMax M2.5 is an open-weight large language model from Shanghai-based MiniMax, released on February 12, 2026. It uses a sparse Mixture-of-Experts design with roughly 229B total parameters but only about 10B active per token, which keeps inference cheap while preserving frontier-level quality on coding, tool use, web search, and office-document tasks. The model is text-only; MiniMax handles vision and audio through separate models.

The release shipped in two variants that share the same weights and capability but differ in speed: the standard MiniMax M2.5 and a faster M2.5-Lightning. MiniMax positioned both as built for real-world productivity, trained with reinforcement learning across many complex digital working environments rather than tuned purely for static benchmarks. It posts 80.2% on SWE-bench Verified and 76.3% on BrowseComp, and lands around 42 on the Artificial Analysis Intelligence Index — competitive with much larger closed models.

The headline pitch is price. At launch MiniMax priced M2.5 at roughly one-tenth to one-twentieth the cost of comparable closed frontier models such as Claude Opus 4.6, making sustained agentic use economically viable. The weights are published on Hugging Face under a modified MIT license that requires commercial users to display the 'MiniMax M2.5' name in their interface. It has since been succeeded in MiniMax's lineup by later M-series models (M2.7), and on the official API M2.5 is now listed among legacy models.

Released2026-02-12
LicenseModified MIT (attribution required for commercial use)
WeightsOpen weights
Parameters229B total / 10B active (MoE)
Context205K
Max output~131K tokens
ArchitectureSparse Mixture-of-Experts (MoE) transformer with about 229B total parameters and roughly 10B activated per token. Trained with large-scale reinforcement learning across many real-world digital working environments, with a reasoning ("thinking") mode for agentic and coding tasks. Released in two interchangeable variants — a standard model and a faster M2.5-Lightning — that share weights and capability but differ in serving speed.
Knowledge cutoffNot disclosed
ModalitiesText
StatusSuperseded

Benchmarks

  1. SWE-bench Verified80.2%
  2. Multi-SWE-Bench51.3%
  3. BrowseComp76.3%
  4. GPQA Diamond85.2%
  5. AIME 202586.3%
  6. MMLU-Pro80.1%
  7. Artificial Analysis Intelligence Index42%

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input$0.15 / 1M tokens (M2.5); $0.30 / 1M tokens (M2.5-Lightning) per 1M tokens
Output$1.20 / 1M tokens (M2.5); $2.40 / 1M tokens (M2.5-Lightning) per 1M tokens

Launch pricing from MiniMax. Both variants support prompt caching. M2.5 is now listed among legacy models on the MiniMax API; third-party providers (OpenRouter, SiliconFlow, Fireworks, Novita) may price it differently.

Pricing source ↗

Strengths

  • Strong agentic coding: 80.2% on SWE-bench Verified, competitive with closed frontier models
  • Very low cost per token relative to closed rivals, enabling long-running agent workloads
  • Open weights under a permissive (modified MIT) license, downloadable and self-hostable
  • Efficient MoE design — about 10B active parameters keeps serving cheap and fast
  • Two serving tiers: standard M2.5 and a higher-throughput M2.5-Lightning variant
  • Long 205K-token context window for large codebases and multi-step agent traces
  • Built for real-world office and tool-use tasks, not just static benchmarks

Best for

  • Autonomous and semi-autonomous coding agents running across large codebases
  • Cost-sensitive, long-horizon agent workflows where token volume is high
  • Web-research and browsing agents (strong BrowseComp performance)
  • Self-hosted or private deployment where open weights are a requirement
  • Office and document-generation tasks (Word, Excel, PowerPoint workflows)
  • Multilingual programming across Python, Go, C++, Rust, and TypeScript

How to access

ProviderModel ID
MiniMax ↗MiniMax-M2.5
OpenRouter ↗minimax/minimax-m2.5
Hugging Face ↗MiniMaxAI/MiniMax-M2.5

MiniMax M-Series — every version

The full lineage of the MiniMax M-Series line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

VersionReleasedContextLicense
MiniMax M3current2026-06-011MMiniMax Community
MiniMax M2.7 / M2.7-highspeed2026-03-18Open weights
MiniMax M2.5 / M2.5-Lightning2026-02-12Open weights
MiniMax M2.12025-12-23Open weights
MiniMax M22025-10-27MIT

FAQ

What is the difference between MiniMax M2.5 and M2.5-Lightning?

They share the same weights and capabilities and differ only in serving speed and price. The standard M2.5 runs around 50 tokens per second at $0.15/$1.20 per million input/output tokens, while M2.5-Lightning roughly doubles throughput to about 100 tokens per second at $0.30/$2.40 per million tokens.

Is MiniMax M2.5 open source?

The weights are open and published on Hugging Face under a modified MIT license. It is open-weight: free to download and self-host, but the license requires that commercial users prominently display the 'MiniMax M2.5' name in their user interface.

How many parameters does MiniMax M2.5 have?

It is a Mixture-of-Experts model with about 229B total parameters but only roughly 10B activated per token, which is what makes its inference unusually cheap for a model of its quality.

Is MiniMax M2.5 still current?

No. M2.5 launched in February 2026 and has since been succeeded by later M-series releases (such as M2.7). On MiniMax's official API it is now listed among legacy models, though it remains downloadable from Hugging Face and available through third-party providers.