Overview
DeepSeek-V3.2-Exp is an experimental open-weight large language model from DeepSeek, released on September 29, 2025 as part of the DeepSeek V3 line. It is built directly on DeepSeek-V3.1-Terminus and positioned by DeepSeek as an intermediate step toward a next-generation architecture rather than a clean-sheet model.
Its headline change is DeepSeek Sparse Attention (DSA): a two-stage attention path where a lightweight "lightning indexer" scores tokens across the context and a fine-grained selector keeps only the most relevant ones for full attention. Layered on the existing 685B-parameter Mixture-of-Experts stack (about 37B active parameters, with Multi-head Latent Attention), DSA cuts the cost of long-context training and inference while DeepSeek reports output quality on par with V3.1-Terminus.
DeepSeek-V3.2-Exp is text-only and ships under the MIT license, with open weights, a technical report, and GPU kernels (TileLang and CUDA) published on GitHub and Hugging Face. At launch DeepSeek made it the default chat/reasoner endpoint and cut API prices by more than 50%.
| Released | 2025-09-29 |
|---|---|
| License | MIT |
| Weights | Open weights |
| Parameters | 685B total (~37B active, MoE) |
| Context | 128K |
| Max output | 64K |
| Architecture | Mixture-of-Experts transformer (256 experts, ~37B active per token) built on the V3/V3.1 stack with Multi-head Latent Attention (MLA), adding DeepSeek Sparse Attention (DSA): a lightweight "lightning indexer" scores context tokens, then fine-grained token selection restricts attention to a top-k subset for cheaper long-context training and inference. |
| Modalities | Text |
| Status | Available |
Benchmarks
- MMLU-Pro85%
- GPQA-Diamond79.9%
- Humanity's Last Exam19.8%
- LiveCodeBench74.1%
- AIME 202589.3%
- HMMT 202583.6%
- Aider-Polyglot74.5%
- BrowseComp40.1%
- SimpleQA97.1%
- SWE Verified67.8%
- SWE-bench Multilingual57.9%
- Terminal-bench37.7%
Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.
Pricing
| Input | $0.28 / 1M tokens (cache miss) per 1M tokens |
|---|---|
| Cached input | $0.028 / 1M tokens (cache hit) per 1M tokens |
| Output | $0.42 / 1M tokens per 1M tokens |
Official DeepSeek API launch pricing (USD), more than 50% below the prior V3.1-Terminus tier. Third-party hosts list close equivalents (OpenRouter ~$0.27 in / $0.41 out).
Strengths
- Introduces DeepSeek Sparse Attention (DSA) for substantially cheaper long-context training and inference
- Maintains benchmark parity with V3.1-Terminus despite the efficiency changes (e.g. MMLU-Pro 85.0, SimpleQA 97.1)
- MIT-licensed open weights with published technical report and open GPU kernels (TileLang + CUDA)
- Aggressive API pricing — more than 50% cheaper than the prior V3.1-Terminus tier
- Strong math and agentic-search scores (AIME 2025 89.3, BrowseComp 40.1)
Best for
- Long-context document, codebase, and transcript analysis where attention cost dominates
- Cost-sensitive production deployments needing a cheap, capable open-weight chat/reasoner model
- Self-hosting on-prem with open weights and open GPU kernels
- Reasoning, math, and competitive-coding tasks
- Agentic / tool-use and web-search workflows (BrowseComp, SWE-bench)
How to access
| Provider | Model ID |
|---|---|
| DeepSeek ↗ | deepseek-chat / deepseek-reasoner |
| OpenRouter ↗ | deepseek/deepseek-v3.2-exp |
DeepSeek V3 — every version
The full lineage of the DeepSeek V3 line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.
| Version | Released | Context | License |
|---|---|---|---|
| DeepSeek-V3.2current | 2025-12-01 | — | Open weights |
| DeepSeek-V3.2-Speciale | 2025-12-01 | — | Open weights |
| DeepSeek-V3.2-Exp | 2025-09-29 | — | Open weights |
| DeepSeek-V3.1-Terminus | 2025-09-22 | — | Open weights |
| DeepSeek-V3.1 | 2025-08-21 | — | Open weights |
| DeepSeek-V3-0324 | 2025-03-24 | — | Open weights |
| DeepSeek-V3 | 2024-12-26 | — | Open weights |
| DeepSeek-V2.5 | 2024-09-05 | — | Open weights |
| DeepSeek-V2 | 2024-05 | — | Open weights |
FAQ
What is DeepSeek-V3.2-Exp?
It is an experimental open-weight large language model released by DeepSeek on September 29, 2025, built on DeepSeek-V3.1-Terminus. It introduces DeepSeek Sparse Attention (DSA) to make long-context training and inference cheaper while keeping quality on par with V3.1-Terminus.
What is DeepSeek Sparse Attention (DSA)?
DSA is a two-stage attention mechanism: a lightweight "lightning indexer" scores tokens across the context, then a fine-grained selector keeps only the most relevant tokens for full attention. This cuts the compute cost of long sequences with minimal impact on output quality.
Is DeepSeek-V3.2-Exp open source, and what license does it use?
Yes. The weights, technical report, and GPU kernels (TileLang and CUDA) are published on GitHub and Hugging Face under the MIT license, so it can be self-hosted and used commercially.
How much does DeepSeek-V3.2-Exp cost to use?
At launch DeepSeek cut API prices by more than 50%: roughly $0.28 per 1M input tokens (cache miss), about $0.028 per 1M cached input tokens, and $0.42 per 1M output tokens. Third-party hosts like OpenRouter list close equivalents (~$0.27 in / $0.41 out).