DeepSeek-V3.2

Name: DeepSeek-V3.2
Author: DeepSeek

GPT-5-level open-weight MoE that bakes thinking into tool-use, with DeepSeek Sparse Attention for cheap long context.

Overview

DeepSeek-V3.2 is the flagship model of DeepSeek's V3 line, released December 1, 2025. It is a Mixture-of-Experts model with 685B total parameters and roughly 37B active per token, published openly under the MIT license alongside an API-only high-compute sibling, DeepSeek-V3.2-Speciale. DeepSeek positions V3.2 as a 'daily driver at GPT-5-level performance' while keeping inference cheap.

The headline technical change is DeepSeek Sparse Attention (DSA), the sparse-attention mechanism first trialled in V3.2-Exp and now productized: it sharply reduces the compute of long-context inference without measurably hurting quality. V3.2 is also DeepSeek's first model to integrate thinking directly into tool-use, and it supports tool calls in both thinking and non-thinking modes — trained with an agentic data-synthesis pipeline spanning 1,800+ environments and 85k+ complex instructions.

V3.2 ships on the DeepSeek app, web chat, and API, and its weights and technical report are public. It is the successor to DeepSeek-V3.2-Exp and the current top of the V3 series, sitting just below the later DeepSeek-V4 line.

Released	2025-12-01
License	MIT
Weights	Open weights
Parameters	685B total / 37B active (MoE)
Context	128K
Max output	64K
Architecture	Mixture-of-Experts transformer with DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism that cuts the cost of long-context inference while preserving quality. Ships a single hybrid checkpoint that runs in both thinking and non-thinking modes.
Knowledge cutoff	Not disclosed
Modalities	Text
Status	Available

Benchmarks

Grouped bar chart comparing DeepSeek-V3.2-Speciale and DeepSeek-V3.2-Thinking against GPT-5-High, Claude-4.5-Sonnet, and Gemini-3.0-Pro on reasoning benchmarks (AIME 2025, HMMT 2025, HLE, Codeforces) and agentic benchmarks (SWE Verified, Terminal Bench 2.0, T2 Bench, Tool Decathlon) — DeepSeek-V3.2 reasoning and agentic benchmarks vs GPT-5-High, Claude-4.5-Sonnet, and Gemini-3.0-Pro — DeepSeek

DeepSeek-V3.2 (Thinking and Speciale) benchmark comparison vs GPT-5 High, Gemini-3.0 Pro, and Kimi-K2 Thinking, as published by DeepSeek (numbers in the source are accompanied by per-task thinking-token budgets, e.g. 13k; only the score is given here).

Benchmark	GPT-5 High	Gemini-3.0 Pro	Kimi-K2 Thinking	DeepSeek-V3.2 Thinking	DeepSeek-V3.2 Speciale
AIME 2025	94.6 Acc (%)	95 Acc (%)	94.5 Acc (%)	93.1 Acc (%)	96 Acc (%)
HMMT Feb 2025	88.3 Acc (%)	97.5 Acc (%)	89.4 Acc (%)	92.5 Acc (%)	99.2 Acc (%)
HMMT Nov 2025	89.2 Acc (%)	93.3 Acc (%)	89.2 Acc (%)	90.2 Acc (%)	94.4 Acc (%)
IMOAnswerBench	76 Acc (%)	83.3 Acc (%)	78.6 Acc (%)	78.3 Acc (%)	84.5 Acc (%)
LiveCodeBench	84.5 Acc (%)	90.7 Acc (%)	82.6 Acc (%)	83.3 Acc (%)	88.7 Acc (%)
CodeForces	2537 Rating	2708 Rating	—	2386 Rating	2701 Rating
GPQA Diamond	85.7 Acc (%)	91.9 Acc (%)	84.5 Acc (%)	82.4 Acc (%)	85.7 Acc (%)
HLE	26.3 Acc (%)	37.7 Acc (%)	23.9 Acc (%)	25.1 Acc (%)	30.6 Acc (%)

Comparison source ↗

This model's scores

MMLU-Pro85%
GPQA-Diamond82.4%
SWE-bench Verified70%
Artificial Analysis Intelligence Index33%

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input	$0.30 per 1M tokens
Cached input	$0.135 per 1M tokens
Output	$0.45 per 1M tokens

Reasoning (deepseek-reasoner) tier; cache-hit input is discounted ~55%. The non-thinking deepseek-chat tier is cheaper still. Prices reflect the Dec 2025 release.

Pricing source ↗

Strengths

Open weights under the permissive MIT license — fully self-hostable
DeepSeek Sparse Attention makes long-context inference markedly cheaper than dense attention
First DeepSeek model to fold reasoning directly into tool-use, in both thinking and non-thinking modes
GPT-5-level general performance at a fraction of frontier-model API pricing
Strong math, coding, and agentic results from training across 1,800+ environments

Best for

Cost-sensitive agentic workflows that need tool-calling plus reasoning
Long-document and large-codebase tasks where sparse attention lowers cost
Self-hosted deployments that require open weights and a permissive license
Math and competitive-programming problem solving
General assistant and coding workloads at GPT-5-class quality

How to access

Provider	Model ID
DeepSeek ↗	`deepseek-chat / deepseek-reasoner (DeepSeek-V3.2)`
OpenRouter ↗	`deepseek/deepseek-v3.2`

DeepSeek V3 — every version

The full lineage of the DeepSeek V3 line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

Version	Released	Context	License
DeepSeek-V3.2current	2025-12-01	—	Open weights
DeepSeek-V3.2-Speciale	2025-12-01	—	Open weights
DeepSeek-V3.2-Exp	2025-09-29	—	Open weights
DeepSeek-V3.1-Terminus	2025-09-22	—	Open weights
DeepSeek-V3.1	2025-08-21	—	Open weights
DeepSeek-V3-0324	2025-03-24	—	Open weights
DeepSeek-V3	2024-12-26	—	Open weights
DeepSeek-V2.5	2024-09-05	—	Open weights
DeepSeek-V2	2024-05	—	Open weights

FAQ

Is DeepSeek-V3.2 open source?

The weights are released openly under the MIT license, so you can download, run, and fine-tune the model yourself. DeepSeek also published a technical report describing the architecture and training.

What is DeepSeek Sparse Attention (DSA)?

DSA is the fine-grained sparse-attention mechanism at the core of V3.2. It substantially reduces the compute needed for long-context inference while preserving model quality, which is what lets V3.2 run long contexts cheaply compared with dense-attention models.

How does V3.2 differ from V3.2-Speciale?

V3.2 is the balanced 'daily driver' positioned at GPT-5-level performance and available on the app, web, and API. V3.2-Speciale is a high-compute, API-only variant that maxes out reasoning (gold-medal-level results on the 2025 IMO and IOI) but uses far more tokens.

Does V3.2 support tool use and reasoning together?

Yes. V3.2 is DeepSeek's first model to integrate thinking directly into tool-use, and it supports tool calls in both thinking and non-thinking modes. It was trained with an agentic data pipeline covering 1,800+ environments and 85k+ complex instructions.

// Overview

// Benchmarks

This model's scores

// Pricing

// Strengths

// Best for

// How to access

// DeepSeek V3 — every version

// FAQ