AI/TLDR

Ministral 3 (3B / 8B / 14B)

Mistral's vision-capable, Apache 2.0 edge models that run on a laptop, with 256K context and frontier reasoning for their weight class.

Overview

Ministral 3 is Mistral AI's family of small, open-weight edge models released on December 2, 2025, alongside the larger Mistral Large 3. It comes in three dense sizes — Ministral 3 3B, 8B, and 14B — and each size ships in base, instruct, and reasoning variants under the permissive Apache 2.0 license. All of them are vision-capable, pairing a language model with a compact 0.4B vision encoder, and all support a 256K-token (262,144) context window.

The line is built to run locally rather than in the cloud. Mistral positions Ministral 3 for laptops, smartphones, drones, and embedded systems: the 3B is small enough to run 100% in the browser on WebGPU and fits in about 8GB of VRAM in FP8, the 8B targets single-GPU edge deployment at roughly 12GB, and the 14B fits in around 32GB of VRAM in BF16. The models are multilingual across dozens of languages and offer native function calling and JSON output for agentic use.

Despite their size, the reasoning variants post strong numbers for their weight class — Ministral 3 14B Reasoning scores 0.850 on AIME 2025. The models are available as open weights on Hugging Face and through hosted APIs on Mistral AI Studio / La Plateforme, Amazon Bedrock, Azure AI Foundry, OpenRouter, and other providers, with Mistral's own API priced from $0.10 per million tokens for the 3B.

Released2025-12-02
LicenseApache 2.0
WeightsOpen weights
Parameters3B, 8B, 14B (dense)
Context256K
ArchitectureDense transformer (no MoE). Each model pairs a language model with a small vision encoder: 3B = 3.4B language + 0.4B vision; 8B = 8.4B + 0.4B; 14B = 13.5B + 0.4B. Every size ships in base, instruct, and reasoning variants, and all are vision-capable. Sized for edge: the 3B fits in roughly 8GB of VRAM in FP8 (small enough to run in-browser on WebGPU), the 8B in ~12GB FP8, and the 14B in ~32GB BF16 (under 24GB when quantized).
Knowledge cutoffNot disclosed by Mistral AI
ModalitiesText, Vision
StatusAvailable

Benchmarks

  1. AIME 2025 (14B Reasoning)0.85%
  2. AIME 2024 (14B Reasoning)0.898%
  3. GPQA Diamond (14B Reasoning)0.712%
  4. LiveCodeBench (14B Reasoning)0.646%
  5. MATH Maj@1 (14B Instruct)0.904%
  6. Arena Hard (14B Instruct)0.551%
  7. WildBench (14B Instruct)68.5%
  8. MMLU 5-shot (14B Base)0.794%
  9. Multilingual MMLU (14B Base)0.742%
  10. MATH Maj@1 (8B Instruct)0.876%
  11. MATH Maj@1 (3B Instruct)0.83%

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input$0.10 (3B) / $0.15 (8B) / $0.20 (14B) per 1M tokens per 1M tokens
Output$0.10 (3B) / $0.15 (8B) / $0.20 (14B) per 1M tokens per 1M tokens

Mistral AI La Plateforme list prices; input and output are the same rate for each size. Open weights are free to self-host under Apache 2.0. Hosted prices vary by provider (e.g. OpenRouter lists the 14B at $0.20 in / $0.20 out per 1M).

Pricing source ↗

Strengths

  • Apache 2.0 open weights — free to use, modify, and self-host for commercial and non-commercial work
  • Runs on-device: the 3B fits in ~8GB VRAM (and in-browser on WebGPU), the 8B on a single GPU, the 14B in ~32GB BF16
  • Vision-capable across the whole family via a 0.4B vision encoder
  • 256K-token context window, large for models this small
  • Strong reasoning for its weight class — 14B Reasoning hits 0.850 on AIME 2025
  • Three sizes and three variants (base / instruct / reasoning) give a wide deployment range
  • Native function calling and JSON output for agentic and tool-using workflows
  • Multilingual across dozens of languages

Best for

  • On-device and offline AI assistants on laptops, phones, and embedded hardware
  • Privacy-sensitive deployments where data must stay local
  • Low-latency edge inference on drones, robots, and IoT devices
  • Cost-efficient document and image understanding with the vision encoder
  • Math and coding tasks using the reasoning variants
  • Self-hosted agents with function calling and structured JSON output
  • Fine-tuning a small open-weight base model for a specialized domain
  • In-browser demos and apps running the 3B on WebGPU

How to access

ProviderModel ID
Mistral AI (La Plateforme) ↗ministral-3-14b-2512
OpenRouter ↗mistralai/ministral-14b-2512
Amazon Bedrock ↗
Hugging Face (weights) ↗mistralai/Ministral-3-14B-Instruct-2512

Ministral — every version

The full lineage of the Ministral line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

VersionReleasedContextLicense
Ministral 3 (3B / 8B / 14B)current2025-12-02Apache-2.0
Ministral 3B / 8B (24.10)2024-10-09Open weights

FAQ

What is Ministral 3?

Ministral 3 is Mistral AI's family of small, open-weight edge models released on December 2, 2025. It comes in 3B, 8B, and 14B dense sizes, each with base, instruct, and reasoning variants, all vision-capable and licensed under Apache 2.0 with a 256K-token context window.

Is Ministral 3 open source and free?

The weights are open and released under the Apache 2.0 license, so you can download, modify, and self-host all sizes for free, including commercially. If you use a hosted API instead, you pay per token — for example, Mistral lists $0.10/$0.15/$0.20 per million tokens for the 3B/8B/14B.

Can Ministral 3 run on a laptop or on-device?

Yes — that is its main design goal. The 3B fits in roughly 8GB of VRAM in FP8 and can run in a browser on WebGPU, the 8B targets single-GPU edge deployment at about 12GB, and the 14B fits in around 32GB of VRAM in BF16 (less when quantized).

How good is Ministral 3 at reasoning and math?

Strong for its weight class. The 14B reasoning variant scores 0.850 on AIME 2025, 0.898 on AIME 2024, 0.712 on GPQA Diamond, and 0.646 on LiveCodeBench, per Mistral's Hugging Face model card.