Devstral 2 / Devstral Small 2

Name: Devstral 2 / Devstral Small 2
Author: Mistral AI

Mistral's frontier open-weights coding-agent family — a 123B dense Devstral 2 and a laptop-friendly 24B Devstral Small 2.

Overview

Devstral 2 is Mistral AI's December 2025 coding-agent family, released on December 9, 2025 in two open-weights sizes: Devstral 2, a 123-billion-parameter dense model, and Devstral Small 2, a 24-billion-parameter model that runs on consumer hardware. Both are built for agentic software engineering — exploring codebases, editing across many files, and powering tool-using coding agents — and ship alongside the open-source Mistral Vibe CLI.

On SWE-Bench Verified, Devstral 2 scores 72.2% and Devstral Small 2 scores 68.0%, which Mistral positions among the strongest open-weights results, with Devstral Small 2 landing near models several times its size. Both models use a 256K-token context window and a dense Transformer architecture (the Hugging Face config reports model_type 'ministral3' with rope-scaling, with no expert-routing fields), so every token uses all parameters rather than a Mixture-of-Experts subset.

The two models differ on licensing. Devstral 2 (123B) ships under a Modified MIT license that withholds rights from companies whose global consolidated monthly revenue exceeds $20 million, while Devstral Small 2 (24B) is permissively licensed under Apache 2.0. Devstral Small 2 is fine-tuned from Mistral-Small-3.1-24B-Base-2503 and is distributed in FP8 with support for vLLM, SGLang, Transformers, Ollama, LM Studio, and llama.cpp.

Released	2025-12-09
License	Devstral 2: Modified MIT; Devstral Small 2: Apache 2.0
Weights	Open weights
Parameters	Devstral 2: 123B; Devstral Small 2: 24B
Context	256K
Max output	256K
Architecture	Dense decoder-only Transformer (model_type ministral3, with rope-scaling); not Mixture-of-Experts
Knowledge cutoff	2025
Modalities	Text
Status	Generally available

Benchmarks

Bar chart titled 'SWE-Bench Verified: Open-weight vs Proprietary models' comparing Devstral Small 2 (68.0) and Devstral 2 (72.2) against DeepSWE (42.2), CWM (53.9), GPT-OSS-120B (62.4), GLM 4.6 (68.0), Minimax M2 (69.4), Qwen 3 coder plus (69.6), Kimi K2 thinking (71.3), Deepseek V3.2 (73.1), Grok Code Fast 1 (70.8), Gemini 3 Pro (76.2), GPT 5.1 Codex Max (77.9), and Claude 4.5 Sonnet (77.2) on SWE-Bench Verified. — SWE-Bench Verified scores: Devstral 2 / Devstral Small 2 vs open-weight and proprietary models. — Mistral AI

Scatter plot of SWE-Bench Verified Regular Performance (%) versus Model Size (B parameters), showing Devstral 2 and Devstral Small 2 in the Pareto-efficient region compared with MiniMax M2, GLM 4.6, Qwen3 coder plus, Qwen 3 coder flash, CWM, DeepSeek v3.2, and Kimi K2 thinking. — SWE-Bench Verified performance plotted against model size (efficiency frontier). — Mistral AI

Stacked Win/Tie/Lose bar chart of human evaluations: Devstral 2 vs DeepSeek V3.2 (42.8% win, 28.6% tie, 28.6% lose) and Devstral 2 vs Sonnet 4.5 (21.4% win, 25.5% tie, 53.1% lose). Evaluations judged by humans conducted by a third party (Surge). — Human-evaluation win/tie/lose rates for Devstral 2 vs DeepSeek V3.2 and Claude Sonnet 4.5. — Mistral AI

Devstral 2 and Devstral Small 2 vs named competitors on SWE-Bench Verified, SWE-Bench Multilingual, and Terminal-Bench 2 (competitor figures are publicly reported values).

Benchmark	Devstral 2	Devstral Small 2	GLM 4.6	Qwen 3 Coder Plus	MiniMax M2	Kimi K2 Thinking	DeepSeek v3.2	GPT 5.1 Codex High	GPT 5.1 Codex Max	Gemini 3 Pro	Claude Sonnet 4.5
Size (B parameters)	123 B params	24 B params	355 B params	480 B params	230 B params	1000 B params	671 B params	—	—	—	—
SWE-Bench Verified	72.2%	68%	68%	69.6%	69.4%	71.3%	73.1%	73.7%	77.9%	76.2%	77.2%
SWE-Bench Multilingual	61.3%	55.7%	—	54.7%	56.5%	61.1%	70.2%	—	—	—	68%
Terminal-Bench 2	32.6%	22.5%	24.6%	25.4%	30%	35.7%	46.4%	52.8%	60.4%	54.2%	42.8%

Comparison source ↗

This model's scores

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input	$0.40 / 1M tokens (Devstral 2); $0.10 / 1M tokens (Devstral Small 2) per 1M tokens
Output	$2.00 / 1M tokens (Devstral 2); $0.30 / 1M tokens (Devstral Small 2) per 1M tokens

Mistral API list prices; a free trial period was offered at launch.

Pricing source ↗

Strengths

Frontier open-weights coding scores: 72.2% (Devstral 2) and 68.0% (Devstral Small 2) on SWE-Bench Verified
Devstral Small 2 (24B) runs locally on consumer hardware while staying near far larger models
Large 256K-token context for whole-repository and multi-file agentic edits
Open weights on Hugging Face — Apache 2.0 for Small 2, Modified MIT for the 123B model
Low API pricing and a free trial period via the Mistral API, plus the open-source Mistral Vibe CLI

Best for

Agentic software engineering: exploring codebases and editing multiple files
Terminal-native coding agents via the Mistral Vibe CLI or OpenHands-style scaffolds
Self-hosted, privacy-sensitive coding assistants (Devstral Small 2 on a single workstation)
Cost-efficient code generation and refactoring through the Mistral API

How to access

Provider	Model ID
Mistral AI ↗	`devstral-2-25-12`
Mistral AI ↗	`devstral-small-2-25-12`
OpenRouter ↗	`mistralai/devstral-2512`

Devstral — every version

The full lineage of the Devstral line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

Version	Released	Context	License
Devstral 2 / Devstral Small 2current	2025-12-09	256K	Devstral 2: Modified MIT; Devstral Small 2: Apache 2.0
Devstral Medium & Small 1.1 (25.07)	2025-07-10	128K	Devstral Small 1.1: Apache 2.0; Devstral Medium: proprietary (API only)
Devstral Small (25.05)	2025-05-21	128K	Apache-2.0

FAQ

What is the difference between Devstral 2 and Devstral Small 2?

Devstral 2 is the 123-billion-parameter flagship that scores 72.2% on SWE-Bench Verified and ships under a Modified MIT license. Devstral Small 2 is a 24-billion-parameter model scoring 68.0% on SWE-Bench Verified, runs on consumer hardware, and is released under the permissive Apache 2.0 license. Both share a 256K context window and a dense Transformer architecture.

Is Devstral 2 open weights, and what does the Modified MIT license allow?

Yes, both models have open weights on Hugging Face. Devstral Small 2 is Apache 2.0. The 123B Devstral 2 uses a Modified MIT license that withholds rights from any company whose global consolidated monthly revenue exceeds $20 million for the preceding month, who must obtain a separate commercial license from Mistral.

How much does Devstral 2 cost on the Mistral API?

Per Mistral's pricing page, Devstral 2 is $0.40 per million input tokens and $2.00 per million output tokens, while Devstral Small 2 is $0.10 per million input and $0.30 per million output. Mistral offered a free trial period at launch.

Can Devstral Small 2 run locally?

Yes. Devstral Small 2 has 24B parameters, ships in FP8, and is supported by vLLM, SGLang, Transformers, Ollama, LM Studio, and llama.cpp, making it deployable on a single high-memory workstation for private, agentic coding.

// Overview

// Benchmarks

This model's scores

// Pricing

// Strengths

// Best for

// How to access

// Devstral — every version

// FAQ