Overview
Mistral Medium 3.5 is Mistral AI's flagship model, released on 28 April 2026. It is a dense 128-billion-parameter model with a 256k-token context window that accepts both text and images. Unlike most of Mistral's earlier lineup, it is published as open weights on Hugging Face under a modified MIT license, so teams can self-host it on as few as four GPUs as well as call it through the Mistral API.
What makes Mistral Medium 3.5 notable is that it is a "merged" model. Mistral retired several specialist models — Mistral Medium 3.1 for general chat, Magistral for reasoning and Devstral 2 for coding — and folded their capabilities into this single set of weights. Reasoning effort is configurable per request, so the same model can return a quick answer or work through a long, multi-step agentic task. Its vision encoder was trained from scratch to handle variable image sizes and aspect ratios.
Mistral Medium 3.5 is the default model in Le Chat and powers the remote coding agents in Mistral's Vibe CLI, where agents run in the cloud, in parallel, and notify you when they finish. It is priced at $1.50 per million input tokens and $7.50 per million output tokens through the Mistral API, and is also distributed via Hugging Face, Ollama and NVIDIA NIM endpoints for self-hosting.
| Released | 2026-04-28 |
|---|---|
| License | Modified MIT |
| Weights | Open weights |
| Parameters | 128B (dense) |
| Context | 256K |
| Architecture | Dense transformer with a vision encoder trained from scratch to handle variable image sizes and aspect ratios. Reasoning effort is configurable per request, so the same weights serve both quick chat replies and long agentic runs. Mistral describes it as its first "merged" flagship: a single dense 128B model that folds in the jobs previously split across Mistral Medium 3.1 (general instruction-following), Magistral (reasoning) and Devstral 2 (coding). Self-hosting is possible on as few as four GPUs. |
| Modalities | Text, Vision |
| Status | Available |
Benchmarks


Agentic Benchmarks vs competing models (scores transcribed from Mistral AI's published bar chart)
| Benchmark | Mistral Medium 3.5 128B | Claude Sonnet 4.5 | Claude Sonnet 4.6 | Kimi K2.5 1T A32B | GLM-5.1 744B A40B | Qwen3.5 397B A17B |
|---|---|---|---|---|---|---|
| SWE-Bench Verified | 77.6% | 77.2% | 79.6% | 76.8% | 80.2% | 76.4% |
| tau-3 Telecom | 91.4 score | 84.9 score | 70.4 score | 86.8 score | 98.7 score | 97.8 score |
| tau-3 Airline | 72 score | 72 score | 83 score | 76.5 score | 79.5 score | 81.5 score |
| tau-3 Retail | 76.1 score | 72.4 score | 75.9 score | 72.8 score | 76.3 score | 84.4 score |
| tau-3 Banking | 13.4 score | 22.4 score | 28.4 score | 14.9 score | 16.2 score | 9.8 score |
| BrowseComp | 48.6 score | 43.9 score | 74.7 score | 74.9 score | 79.3 score | 78.6 score |
This model's scores
Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.
Pricing
| Input | $1.50 / 1M tokens per 1M tokens |
|---|---|
| Output | $7.50 / 1M tokens per 1M tokens |
Mistral API (la Plateforme) list price. Open weights are also available for self-hosting at no per-token cost.
Strengths
- Single open-weight model that handles chat, reasoning, vision and coding, removing the need to route between specialist models
- Strong real-world coding: 77.6% on SWE-Bench Verified, ahead of the coding-specialist Devstral 2 it replaces
- Configurable per-request reasoning effort lets one deployment cover fast chat and deep agentic work
- 256k-token context window for long documents and codebases
- Self-hostable on as few as four GPUs, with open weights under a modified MIT license
- Native vision input via an encoder trained from scratch for varied image sizes and aspect ratios
Best for
- Autonomous and remote coding agents (it powers Mistral Vibe's cloud agents)
- Resolving real GitHub issues and generating code patches
- Agentic tool-use and workflow automation over long contexts
- General enterprise chat and instruction-following as a Le Chat default model
- Multimodal document and image understanding
- Self-hosted private deployments where open weights and data control matter
How to access
| Provider | Model ID |
|---|---|
| Mistral AI ↗ | mistral-medium-3-5-26-04 |
Mistral Medium — every version
The full lineage of the Mistral Medium line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.
| Version | Released | Context | License |
|---|---|---|---|
| Mistral Medium 3.5current | 2026-04-28 | — | MIT |
| Mistral Medium 3.1 | 2025-08-12 | — | Open weights |
| Mistral Medium 3 | 2025-05-07 | — | Open weights |
| Mistral Medium (2023) | 2023-12 | — | Proprietary |
FAQ
Is Mistral Medium 3.5 open source?
It is open-weight, not fully open source. Mistral published the weights on Hugging Face under a modified MIT license, so you can download and self-host the model (on as few as four GPUs), but the license carries Mistral's own modifications rather than being a standard unrestricted MIT release.
What models does Mistral Medium 3.5 replace?
Mistral describes it as a "merged" flagship that folds three earlier specialist models into one set of weights: Mistral Medium 3.1 for general instruction-following, Magistral for reasoning, and Devstral 2 for coding. A single per-request reasoning-effort setting lets it cover both quick chat and deep agentic work.
How good is Mistral Medium 3.5 at coding?
It scores 77.6% on SWE-Bench Verified, which measures resolving real GitHub issues with code patches — ahead of the coding-specialist Devstral 2 it replaces. It also powers the remote coding agents in Mistral's Vibe CLI, which run autonomously in the cloud.
How much does Mistral Medium 3.5 cost?
Through the Mistral API it is priced at $1.50 per million input tokens and $7.50 per million output tokens. Because the weights are open, you can also self-host it with no per-token API cost.