Overview
Mistral Small (24.09), released by Mistral AI on 17 September 2024 under the API name mistral-small-2409, is a 22-billion-parameter dense large language model. It was the September relaunch of Mistral's "Small" tier: the original February 2024 Mistral Small had been pushed to legacy status, and this 24.09 version brought the line back as an enterprise-grade, open-weight model positioned as a midpoint between the smaller Mistral NeMo 12B and the flagship Mistral Large 2.
Compared with the prior Small model, Mistral Small v24.09 shipped with improved human alignment, stronger reasoning, and better code generation, while arriving at roughly an 80% lower list price — $0.20 per million input tokens and $0.60 per million output tokens on la Plateforme. Mistral pitched it at high-volume, cost-sensitive workloads such as translation, summarization, sentiment analysis, and other tasks that do not need a full general-purpose frontier model. It is a text-only model with a 32k-token context window and built-in function calling.
The weights (Mistral-Small-Instruct-2409) were published on Hugging Face under the Mistral Research License, which permits research and non-commercial self-deployment (for example with vLLM) but requires a separate commercial agreement for production use. Mistral AI has since retired the hosted mistral-small-2409 endpoint — scheduled for 30 November 2025 — and directs users to the newer Mistral Small line (Mistral Small 3 onward, and Mistral Small 3.2 as the recommended successor), though the released checkpoint remains available for download.
| Released | 2024-09-17 |
|---|---|
| License | Mistral Research License (MRL) — non-commercial / research use; a separate commercial license is required for production use |
| Weights | Open weights |
| Parameters | 22B (dense) |
| Context | 32k tokens |
| Max output | Not separately published; bounded by the 32k token context window |
| Architecture | Dense decoder-only transformer (loaded as MistralForCausalLM in Hugging Face Transformers), 22B parameters, 32k sequence length, 32768-token vocabulary, instruction-tuned with native function-calling support. Not a mixture-of-experts model. Running the full-precision weights on a single GPU requires at least 44 GB of GPU RAM. |
| Knowledge cutoff | Not published by Mistral AI |
| Modalities | text |
| Status | Retired. Mistral AI scheduled mistral-small-2409 for retirement on its API on 2025-11-30; the recommended successor is Mistral Small 3.2 (the open-weight Small line continued with Mistral Small 3 in January 2025). The Mistral Research License weights remain downloadable on Hugging Face for self-hosting. |
Benchmarks
- MMLU-Pro (5-shot)48.4%
Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.
Pricing
| Input | $0.20 / 1M tokens per 1M tokens |
|---|---|
| Output | $0.60 / 1M tokens per 1M tokens |
Launch list price on Mistral's la Plateforme — about an 80% reduction vs. the previous Mistral Small. The hosted endpoint was scheduled for retirement on 2025-11-30.
Strengths
- Strong cost-efficiency for its size: $0.20/M input and $0.60/M output, roughly 80% cheaper than the previous Mistral Small at launch
- Open-weight: full checkpoint downloadable for self-hosting and fine-tuning under the Mistral Research License
- Improved human alignment, reasoning, and code generation over the earlier Mistral Small
- Native function calling and a 32k-token context window
- Fits as a practical midpoint between Mistral NeMo 12B and Mistral Large 2 for mid-tier workloads
Best for
- High-volume translation and multilingual text processing
- Summarization of long documents within the 32k context
- Sentiment analysis and text classification at scale
- Cost-sensitive production tasks that don't need a frontier general-purpose model
- Local / self-hosted deployment (e.g. via vLLM) for research and non-commercial use
- Function-calling / tool-use applications on a budget
How to access
| Provider | Model ID |
|---|---|
| Mistral AI (la Plateforme) ↗ | mistral-small-2409 |
Mistral Small — every version
The full lineage of the Mistral Small line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.
| Version | Released | Context | License |
|---|---|---|---|
| Mistral Small 4current | 2026-03-16 | — | Apache-2.0 |
| Mistral Small 3.2 | 2025-06-20 | — | Apache-2.0 |
| Mistral Small 3.1 | 2025-03-17 | — | Open weights |
| Mistral Small 3 | 2025-01-30 | — | Apache-2.0 |
| Mistral Small (24.09) | 2024-09-17 | — | Open weights |
FAQ
What is Mistral Small v24.09?
Mistral Small v24.09 (API name mistral-small-2409) is a 22-billion-parameter, text-only open-weight language model released by Mistral AI on 17 September 2024. It relaunched Mistral's mid-tier "Small" line with better alignment, reasoning, and code, plus an ~80% price cut, sitting between Mistral NeMo 12B and Mistral Large 2.
Is Mistral Small (24.09) still available?
The hosted mistral-small-2409 endpoint on Mistral's API was scheduled for retirement on 30 November 2025, with Mistral Small 3.2 as the recommended successor. The open weights (Mistral-Small-Instruct-2409) remain downloadable from Hugging Face under the Mistral Research License for self-hosting.
How much did Mistral Small (24.09) cost?
At launch on Mistral's la Plateforme it was priced at $0.20 per million input tokens and $0.60 per million output tokens — roughly 80% cheaper than the previous Mistral Small release.
What is the context window and license of Mistral Small (24.09)?
It has a 32k-token context window and is released under the Mistral Research License (MRL), which allows research and non-commercial self-deployment but requires a separate commercial license for production use. Running it on a single GPU needs at least 44 GB of GPU RAM.