AI/TLDR

DeepSeek-V3.1

One open-weights model, two modes — fast answers or deep reasoning

Overview

DeepSeek-V3.1 is the August 2025 update to DeepSeek's V3 line, released on 21 August 2025 under an MIT license with open weights on Hugging Face. It keeps the same 671B-parameter Mixture-of-Experts backbone as DeepSeek-V3 (about 37B parameters active per token) and adds a hybrid design: a single model that can answer directly or switch into a chain-of-thought "thinking" mode. On DeepSeek's own API the two behaviours are exposed as deepseek-chat (non-thinking) and deepseek-reasoner (thinking), both sharing a 128K-token context window.

The headline change in DeepSeek-V3.1 is that one model now covers both quick replies and deep reasoning, where the previous generation split this across DeepSeek-V3-0324 and the R1 reasoning model. In thinking mode it reaches reasoning quality close to DeepSeek-R1-0528 while producing answers faster, and DeepSeek extended the long-context training (a 32K phase grown to 630B tokens and a 128K phase to 209B tokens) so the model holds up better over long inputs.

DeepSeek-V3.1 also targets agent workloads. DeepSeek post-trained it for stronger tool calling, multi-step search agents, and code agents, and reports gains on coding and terminal benchmarks over the prior V3 and R1 releases. The model is text-only and ships with FP8 (UE8M0) weights for efficient inference, and the open MIT weights mean it can be self-hosted as well as called through DeepSeek's API and third-party providers.

Released2025-08-21
LicenseMIT
WeightsOpen weights
Parameters671B total / 37B active (MoE)
Context128K
ArchitectureMixture-of-Experts (MoE) transformer with Multi-head Latent Attention (MLA), sparse 37B-active routing, and FP8 (UE8M0) training. Single hybrid model exposes a non-thinking mode (deepseek-chat) and a thinking mode (deepseek-reasoner), toggled at the API/template level.
ModalitiesText
StatusAvailable

Benchmarks

  1. MMLU-Redux (Thinking)93.7%
  2. MMLU-Pro (Thinking)84.8%
  3. GPQA-Diamond (Thinking)80.1%
  4. AIME 2025 (Thinking)88.4%
  5. AIME 2024 (Thinking)93.1%
  6. HMMT 2025 (Thinking)84.2%
  7. LiveCodeBench (Thinking)74.8%
  8. Aider-Polyglot (Thinking)76.3%
  9. SWE-bench Verified (Agent)66%
  10. SWE-bench Multilingual (Agent)54.5%
  11. Terminal-bench31.3%
  12. BrowseComp (Thinking)30%
  13. Humanity's Last Exam (Thinking)15.9%

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input$0.21 / 1M tokens per 1M tokens
Output$0.79 / 1M tokens per 1M tokens

Pricing for deepseek-chat-v3.1 as listed by OpenRouter. DeepSeek's first-party API has since migrated this model line to newer model names, so check the provider for live rates.

Pricing source ↗

Strengths

  • Hybrid think/non-think in one model — pick fast answers or deep reasoning without swapping models
  • Open weights under a permissive MIT license, so it can be self-hosted and fine-tuned
  • Strong reasoning: ~88 on AIME 2025 and 80 on GPQA-Diamond in thinking mode
  • Tuned for agents — better tool calling, code agents, and multi-step search than prior V3/R1 releases
  • Large 128K context window with extended long-context training
  • Very low API pricing relative to closed frontier models

Best for

  • Coding agents and software-engineering tasks (SWE-bench-style fix/patch workflows)
  • Hard math and reasoning problems where the thinking mode pays off
  • Multi-step research and tool-using/search agents
  • Cost-sensitive high-volume chat and generation in non-thinking mode
  • Self-hosted deployments that need open weights and a permissive license
  • Long-document analysis within the 128K context window

How to access

ProviderModel ID
DeepSeek ↗deepseek-chat / deepseek-reasoner
OpenRouter ↗deepseek/deepseek-chat-v3.1
Hugging Face ↗deepseek-ai/DeepSeek-V3.1

DeepSeek V3 — every version

The full lineage of the DeepSeek V3 line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

VersionReleasedContextLicense
DeepSeek-V3.2current2025-12-01Open weights
DeepSeek-V3.2-Speciale2025-12-01Open weights
DeepSeek-V3.2-Exp2025-09-29Open weights
DeepSeek-V3.1-Terminus2025-09-22Open weights
DeepSeek-V3.12025-08-21Open weights
DeepSeek-V3-03242025-03-24Open weights
DeepSeek-V32024-12-26Open weights
DeepSeek-V2.52024-09-05Open weights
DeepSeek-V22024-05Open weights

FAQ

Is DeepSeek-V3.1 open source?

The weights are openly released on Hugging Face under an MIT license, so you can self-host, fine-tune, and use the model commercially. It is open-weights; DeepSeek does not publish the full training dataset.

What is hybrid thinking mode in DeepSeek-V3.1?

DeepSeek-V3.1 is a single model with two modes. A non-thinking mode (deepseek-chat) answers directly for speed, and a thinking mode (deepseek-reasoner) produces chain-of-thought reasoning for harder problems. Both share the same 128K-token context window.

How big is DeepSeek-V3.1 and how much context does it support?

It is a 671B-parameter Mixture-of-Experts model that activates about 37B parameters per token, with a 128K-token context window. It is text-only and uses Multi-head Latent Attention with FP8 (UE8M0) weights.

When was DeepSeek-V3.1 released?

DeepSeek-V3.1 was released on 21 August 2025 as the hybrid-reasoning update to the DeepSeek-V3 line.