AI/TLDR

DeepSeek-V3.2

GPT-5-level open-weight MoE that bakes thinking into tool-use, with DeepSeek Sparse Attention for cheap long context.

Overview

DeepSeek-V3.2 is the flagship model of DeepSeek's V3 line, released December 1, 2025. It is a Mixture-of-Experts model with 685B total parameters and roughly 37B active per token, published openly under the MIT license alongside an API-only high-compute sibling, DeepSeek-V3.2-Speciale. DeepSeek positions V3.2 as a 'daily driver at GPT-5-level performance' while keeping inference cheap.

The headline technical change is DeepSeek Sparse Attention (DSA), the sparse-attention mechanism first trialled in V3.2-Exp and now productized: it sharply reduces the compute of long-context inference without measurably hurting quality. V3.2 is also DeepSeek's first model to integrate thinking directly into tool-use, and it supports tool calls in both thinking and non-thinking modes — trained with an agentic data-synthesis pipeline spanning 1,800+ environments and 85k+ complex instructions.

V3.2 ships on the DeepSeek app, web chat, and API, and its weights and technical report are public. It is the successor to DeepSeek-V3.2-Exp and the current top of the V3 series, sitting just below the later DeepSeek-V4 line.

Released2025-12-01
LicenseMIT
WeightsOpen weights
Parameters685B total / 37B active (MoE)
Context128K
Max output64K
ArchitectureMixture-of-Experts transformer with DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism that cuts the cost of long-context inference while preserving quality. Ships a single hybrid checkpoint that runs in both thinking and non-thinking modes.
Knowledge cutoffNot disclosed
ModalitiesText
StatusAvailable

Benchmarks

Grouped bar chart comparing DeepSeek-V3.2-Speciale and DeepSeek-V3.2-Thinking against GPT-5-High, Claude-4.5-Sonnet, and Gemini-3.0-Pro on reasoning benchmarks (AIME 2025, HMMT 2025, HLE, Codeforces) and agentic benchmarks (SWE Verified, Terminal Bench 2.0, T2 Bench, Tool Decathlon)
DeepSeek-V3.2 reasoning and agentic benchmarks vs GPT-5-High, Claude-4.5-Sonnet, and Gemini-3.0-Pro — DeepSeek
Table comparing tool-use/agentic benchmarks (Tau-squared-Bench, MCP-Universe, MCP-Mark, Tool-Decathlon) for Claude-4.5-Sonnet, GPT-5 High, Gemini-3.0 Pro, Kimi-K2 Thinking, MiniMax M2, and DeepSeek-V3.2 Thinking
DeepSeek-V3.2 ToolUse benchmark comparison against five named models — DeepSeek

DeepSeek-V3.2 (Thinking and Speciale) benchmark comparison vs GPT-5 High, Gemini-3.0 Pro, and Kimi-K2 Thinking, as published by DeepSeek (numbers in the source are accompanied by per-task thinking-token budgets, e.g. 13k; only the score is given here).

BenchmarkGPT-5 HighGemini-3.0 ProKimi-K2 ThinkingDeepSeek-V3.2 ThinkingDeepSeek-V3.2 Speciale
AIME 202594.6 Acc (%)95 Acc (%)94.5 Acc (%)93.1 Acc (%)96 Acc (%)
HMMT Feb 202588.3 Acc (%)97.5 Acc (%)89.4 Acc (%)92.5 Acc (%)99.2 Acc (%)
HMMT Nov 202589.2 Acc (%)93.3 Acc (%)89.2 Acc (%)90.2 Acc (%)94.4 Acc (%)
IMOAnswerBench76 Acc (%)83.3 Acc (%)78.6 Acc (%)78.3 Acc (%)84.5 Acc (%)
LiveCodeBench84.5 Acc (%)90.7 Acc (%)82.6 Acc (%)83.3 Acc (%)88.7 Acc (%)
CodeForces2537 Rating2708 Rating2386 Rating2701 Rating
GPQA Diamond85.7 Acc (%)91.9 Acc (%)84.5 Acc (%)82.4 Acc (%)85.7 Acc (%)
HLE26.3 Acc (%)37.7 Acc (%)23.9 Acc (%)25.1 Acc (%)30.6 Acc (%)

Comparison source ↗

This model's scores

  1. MMLU-Pro85%
  2. GPQA-Diamond82.4%
  3. SWE-bench Verified70%
  4. Artificial Analysis Intelligence Index33%

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input$0.30 per 1M tokens
Cached input$0.135 per 1M tokens
Output$0.45 per 1M tokens

Reasoning (deepseek-reasoner) tier; cache-hit input is discounted ~55%. The non-thinking deepseek-chat tier is cheaper still. Prices reflect the Dec 2025 release.

Pricing source ↗

Strengths

  • Open weights under the permissive MIT license — fully self-hostable
  • DeepSeek Sparse Attention makes long-context inference markedly cheaper than dense attention
  • First DeepSeek model to fold reasoning directly into tool-use, in both thinking and non-thinking modes
  • GPT-5-level general performance at a fraction of frontier-model API pricing
  • Strong math, coding, and agentic results from training across 1,800+ environments

Best for

  • Cost-sensitive agentic workflows that need tool-calling plus reasoning
  • Long-document and large-codebase tasks where sparse attention lowers cost
  • Self-hosted deployments that require open weights and a permissive license
  • Math and competitive-programming problem solving
  • General assistant and coding workloads at GPT-5-class quality

How to access

ProviderModel ID
DeepSeek ↗deepseek-chat / deepseek-reasoner (DeepSeek-V3.2)
OpenRouter ↗deepseek/deepseek-v3.2

DeepSeek V3 — every version

The full lineage of the DeepSeek V3 line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

VersionReleasedContextLicense
DeepSeek-V3.2current2025-12-01Open weights
DeepSeek-V3.2-Speciale2025-12-01Open weights
DeepSeek-V3.2-Exp2025-09-29Open weights
DeepSeek-V3.1-Terminus2025-09-22Open weights
DeepSeek-V3.12025-08-21Open weights
DeepSeek-V3-03242025-03-24Open weights
DeepSeek-V32024-12-26Open weights
DeepSeek-V2.52024-09-05Open weights
DeepSeek-V22024-05Open weights

FAQ

Is DeepSeek-V3.2 open source?

The weights are released openly under the MIT license, so you can download, run, and fine-tune the model yourself. DeepSeek also published a technical report describing the architecture and training.

What is DeepSeek Sparse Attention (DSA)?

DSA is the fine-grained sparse-attention mechanism at the core of V3.2. It substantially reduces the compute needed for long-context inference while preserving model quality, which is what lets V3.2 run long contexts cheaply compared with dense-attention models.

How does V3.2 differ from V3.2-Speciale?

V3.2 is the balanced 'daily driver' positioned at GPT-5-level performance and available on the app, web, and API. V3.2-Speciale is a high-compute, API-only variant that maxes out reasoning (gold-medal-level results on the 2025 IMO and IOI) but uses far more tokens.

Does V3.2 support tool use and reasoning together?

Yes. V3.2 is DeepSeek's first model to integrate thinking directly into tool-use, and it supports tool calls in both thinking and non-thinking modes. It was trained with an agentic data pipeline covering 1,800+ environments and 85k+ complex instructions.