Overview
DeepSeek-V3.1 is the August 2025 update to DeepSeek's V3 line, released on 21 August 2025 under an MIT license with open weights on Hugging Face. It keeps the same 671B-parameter Mixture-of-Experts backbone as DeepSeek-V3 (about 37B parameters active per token) and adds a hybrid design: a single model that can answer directly or switch into a chain-of-thought "thinking" mode. On DeepSeek's own API the two behaviours are exposed as deepseek-chat (non-thinking) and deepseek-reasoner (thinking), both sharing a 128K-token context window.
The headline change in DeepSeek-V3.1 is that one model now covers both quick replies and deep reasoning, where the previous generation split this across DeepSeek-V3-0324 and the R1 reasoning model. In thinking mode it reaches reasoning quality close to DeepSeek-R1-0528 while producing answers faster, and DeepSeek extended the long-context training (a 32K phase grown to 630B tokens and a 128K phase to 209B tokens) so the model holds up better over long inputs.
DeepSeek-V3.1 also targets agent workloads. DeepSeek post-trained it for stronger tool calling, multi-step search agents, and code agents, and reports gains on coding and terminal benchmarks over the prior V3 and R1 releases. The model is text-only and ships with FP8 (UE8M0) weights for efficient inference, and the open MIT weights mean it can be self-hosted as well as called through DeepSeek's API and third-party providers.
| Released | 2025-08-21 |
|---|---|
| License | MIT |
| Weights | Open weights |
| Parameters | 671B total / 37B active (MoE) |
| Context | 128K |
| Architecture | Mixture-of-Experts (MoE) transformer with Multi-head Latent Attention (MLA), sparse 37B-active routing, and FP8 (UE8M0) training. Single hybrid model exposes a non-thinking mode (deepseek-chat) and a thinking mode (deepseek-reasoner), toggled at the API/template level. |
| Modalities | Text |
| Status | Available |
Benchmarks
- MMLU-Redux (Thinking)93.7%
- MMLU-Pro (Thinking)84.8%
- GPQA-Diamond (Thinking)80.1%
- AIME 2025 (Thinking)88.4%
- AIME 2024 (Thinking)93.1%
- HMMT 2025 (Thinking)84.2%
- LiveCodeBench (Thinking)74.8%
- Aider-Polyglot (Thinking)76.3%
- SWE-bench Verified (Agent)66%
- SWE-bench Multilingual (Agent)54.5%
- Terminal-bench31.3%
- BrowseComp (Thinking)30%
- Humanity's Last Exam (Thinking)15.9%
Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.
Pricing
| Input | $0.21 / 1M tokens per 1M tokens |
|---|---|
| Output | $0.79 / 1M tokens per 1M tokens |
Pricing for deepseek-chat-v3.1 as listed by OpenRouter. DeepSeek's first-party API has since migrated this model line to newer model names, so check the provider for live rates.
Strengths
- Hybrid think/non-think in one model — pick fast answers or deep reasoning without swapping models
- Open weights under a permissive MIT license, so it can be self-hosted and fine-tuned
- Strong reasoning: ~88 on AIME 2025 and 80 on GPQA-Diamond in thinking mode
- Tuned for agents — better tool calling, code agents, and multi-step search than prior V3/R1 releases
- Large 128K context window with extended long-context training
- Very low API pricing relative to closed frontier models
Best for
- Coding agents and software-engineering tasks (SWE-bench-style fix/patch workflows)
- Hard math and reasoning problems where the thinking mode pays off
- Multi-step research and tool-using/search agents
- Cost-sensitive high-volume chat and generation in non-thinking mode
- Self-hosted deployments that need open weights and a permissive license
- Long-document analysis within the 128K context window
How to access
| Provider | Model ID |
|---|---|
| DeepSeek ↗ | deepseek-chat / deepseek-reasoner |
| OpenRouter ↗ | deepseek/deepseek-chat-v3.1 |
| Hugging Face ↗ | deepseek-ai/DeepSeek-V3.1 |
DeepSeek V3 — every version
The full lineage of the DeepSeek V3 line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.
| Version | Released | Context | License |
|---|---|---|---|
| DeepSeek-V3.2current | 2025-12-01 | — | Open weights |
| DeepSeek-V3.2-Speciale | 2025-12-01 | — | Open weights |
| DeepSeek-V3.2-Exp | 2025-09-29 | — | Open weights |
| DeepSeek-V3.1-Terminus | 2025-09-22 | — | Open weights |
| DeepSeek-V3.1 | 2025-08-21 | — | Open weights |
| DeepSeek-V3-0324 | 2025-03-24 | — | Open weights |
| DeepSeek-V3 | 2024-12-26 | — | Open weights |
| DeepSeek-V2.5 | 2024-09-05 | — | Open weights |
| DeepSeek-V2 | 2024-05 | — | Open weights |
FAQ
Is DeepSeek-V3.1 open source?
The weights are openly released on Hugging Face under an MIT license, so you can self-host, fine-tune, and use the model commercially. It is open-weights; DeepSeek does not publish the full training dataset.
What is hybrid thinking mode in DeepSeek-V3.1?
DeepSeek-V3.1 is a single model with two modes. A non-thinking mode (deepseek-chat) answers directly for speed, and a thinking mode (deepseek-reasoner) produces chain-of-thought reasoning for harder problems. Both share the same 128K-token context window.
How big is DeepSeek-V3.1 and how much context does it support?
It is a 671B-parameter Mixture-of-Experts model that activates about 37B parameters per token, with a 128K-token context window. It is text-only and uses Multi-head Latent Attention with FP8 (UE8M0) weights.
When was DeepSeek-V3.1 released?
DeepSeek-V3.1 was released on 21 August 2025 as the hybrid-reasoning update to the DeepSeek-V3 line.