Overview
Mistral Large 3 is Mistral AI's flagship open-weight model, announced December 2, 2025 as part of the Mistral 3 family. It is a sparse, granular mixture-of-experts (MoE) model trained from scratch with 675 billion total parameters and 41 billion active parameters per forward pass — Mistral's first MoE release since the Mixtral series.
The architecture pairs a ~673B-parameter granular-MoE language model (39B active) with a 2.5B vision encoder, giving native text and image understanding. It supports a 256K-token context window and many languages, and is designed for production assistants, retrieval-augmented systems, and complex enterprise and agentic workflows. It debuted at #2 in the OSS non-reasoning category (#6 among OSS models overall) on the LMArena leaderboard at launch.
It is released under the permissive Apache 2.0 license and is available open-weight on Hugging Face, plus hosted on Mistral AI Studio and providers such as Amazon Bedrock, Azure Foundry, IBM watsonx, Fireworks, and Together AI.
| Released | 2025-12-02 |
|---|---|
| License | Apache-2.0 |
| Weights | Open weights |
| Parameters | 675B total · 41B active |
| Context | 256K |
| Architecture | Mixture-of-Experts |
| Modalities | Text, Vision |
| Status | Available |
Pricing
| Input | $0.50 / 1M tokens |
|---|---|
| Output | $1.50 / 1M tokens |
Strengths
- Permissive Apache-2.0 license — fully open weights for commercial use and self-hosting
- Frontier-class quality at sparse cost: 41B of 675B parameters active per token
- Native multimodality via an integrated 2.5B vision encoder
- Long 256K-token context for long-document and RAG workloads
- Strong multilingual coverage across many languages
Best for
- Production-grade assistants and chat applications
- Retrieval-augmented generation over long documents
- Agentic and instruction-following enterprise workflows
- Self-hosted deployments needing an Apache-2.0 frontier model
How to access
| Provider | Model ID |
|---|---|
| Mistral AI (La Plateforme) ↗ | mistral-large-3-675b-instruct-2512 |
| Hugging Face ↗ | mistralai/Mistral-Large-3-675B-Instruct-2512 |
Mistral Large — every version
The full lineage of the Mistral Large line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.
| Version | Released | Context | License |
|---|---|---|---|
| Mistral Large 3current | 2025-12-02 | 256K | Apache-2.0 |
| Mistral Large 2.1 (24.11) | 2024-11-18 | — | Open weights |
| Mistral Large 2 (24.07) | 2024-07-24 | — | Open weights |
| Mistral Large (24.02) | 2024-02-26 | — | Proprietary |
FAQ
Is Mistral Large 3 open-weight?
Yes. Mistral Large 3 is released under the permissive Apache 2.0 license, with weights available on Hugging Face (mistralai/Mistral-Large-3-675B-Instruct-2512). Unlike Meta's Llama community license, Apache 2.0 is a standard open-source license, so the weights can be used, modified, and deployed commercially with minimal restrictions, including full self-hosting.
How big is Mistral Large 3?
Mistral Large 3 is a sparse mixture-of-experts model with 675 billion total parameters and 41 billion active parameters per forward pass. The architecture combines a roughly 673B-parameter granular-MoE language model (39B active) with a 2.5B vision encoder for native image understanding. It supports a 256K-token context window.
How much does Mistral Large 3 cost to use via API?
On Mistral's official platform, Mistral Large 3 is priced at $0.50 per million input tokens and $1.50 per million output tokens, per the mistral.ai pricing page. Because it is Apache-2.0 open-weight, you can also self-host the model or run it through third-party providers such as Amazon Bedrock, Azure Foundry, Fireworks, and Together AI.