AI/TLDR

DeepSeek-V3.2-Exp

Experimental open-weight MoE that debuts DeepSeek Sparse Attention to slash long-context cost while matching V3.1-Terminus quality.

Overview

DeepSeek-V3.2-Exp is an experimental open-weight large language model from DeepSeek, released on September 29, 2025 as part of the DeepSeek V3 line. It is built directly on DeepSeek-V3.1-Terminus and positioned by DeepSeek as an intermediate step toward a next-generation architecture rather than a clean-sheet model.

Its headline change is DeepSeek Sparse Attention (DSA): a two-stage attention path where a lightweight "lightning indexer" scores tokens across the context and a fine-grained selector keeps only the most relevant ones for full attention. Layered on the existing 685B-parameter Mixture-of-Experts stack (about 37B active parameters, with Multi-head Latent Attention), DSA cuts the cost of long-context training and inference while DeepSeek reports output quality on par with V3.1-Terminus.

DeepSeek-V3.2-Exp is text-only and ships under the MIT license, with open weights, a technical report, and GPU kernels (TileLang and CUDA) published on GitHub and Hugging Face. At launch DeepSeek made it the default chat/reasoner endpoint and cut API prices by more than 50%.

Released2025-09-29
LicenseMIT
WeightsOpen weights
Parameters685B total (~37B active, MoE)
Context128K
Max output64K
ArchitectureMixture-of-Experts transformer (256 experts, ~37B active per token) built on the V3/V3.1 stack with Multi-head Latent Attention (MLA), adding DeepSeek Sparse Attention (DSA): a lightweight "lightning indexer" scores context tokens, then fine-grained token selection restricts attention to a top-k subset for cheaper long-context training and inference.
ModalitiesText
StatusAvailable

Benchmarks

  1. MMLU-Pro85%
  2. GPQA-Diamond79.9%
  3. Humanity's Last Exam19.8%
  4. LiveCodeBench74.1%
  5. AIME 202589.3%
  6. HMMT 202583.6%
  7. Aider-Polyglot74.5%
  8. BrowseComp40.1%
  9. SimpleQA97.1%
  10. SWE Verified67.8%
  11. SWE-bench Multilingual57.9%
  12. Terminal-bench37.7%

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input$0.28 / 1M tokens (cache miss) per 1M tokens
Cached input$0.028 / 1M tokens (cache hit) per 1M tokens
Output$0.42 / 1M tokens per 1M tokens

Official DeepSeek API launch pricing (USD), more than 50% below the prior V3.1-Terminus tier. Third-party hosts list close equivalents (OpenRouter ~$0.27 in / $0.41 out).

Pricing source ↗

Strengths

  • Introduces DeepSeek Sparse Attention (DSA) for substantially cheaper long-context training and inference
  • Maintains benchmark parity with V3.1-Terminus despite the efficiency changes (e.g. MMLU-Pro 85.0, SimpleQA 97.1)
  • MIT-licensed open weights with published technical report and open GPU kernels (TileLang + CUDA)
  • Aggressive API pricing — more than 50% cheaper than the prior V3.1-Terminus tier
  • Strong math and agentic-search scores (AIME 2025 89.3, BrowseComp 40.1)

Best for

  • Long-context document, codebase, and transcript analysis where attention cost dominates
  • Cost-sensitive production deployments needing a cheap, capable open-weight chat/reasoner model
  • Self-hosting on-prem with open weights and open GPU kernels
  • Reasoning, math, and competitive-coding tasks
  • Agentic / tool-use and web-search workflows (BrowseComp, SWE-bench)

How to access

ProviderModel ID
DeepSeek ↗deepseek-chat / deepseek-reasoner
OpenRouter ↗deepseek/deepseek-v3.2-exp

DeepSeek V3 — every version

The full lineage of the DeepSeek V3 line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

VersionReleasedContextLicense
DeepSeek-V3.2current2025-12-01Open weights
DeepSeek-V3.2-Speciale2025-12-01Open weights
DeepSeek-V3.2-Exp2025-09-29Open weights
DeepSeek-V3.1-Terminus2025-09-22Open weights
DeepSeek-V3.12025-08-21Open weights
DeepSeek-V3-03242025-03-24Open weights
DeepSeek-V32024-12-26Open weights
DeepSeek-V2.52024-09-05Open weights
DeepSeek-V22024-05Open weights

FAQ

What is DeepSeek-V3.2-Exp?

It is an experimental open-weight large language model released by DeepSeek on September 29, 2025, built on DeepSeek-V3.1-Terminus. It introduces DeepSeek Sparse Attention (DSA) to make long-context training and inference cheaper while keeping quality on par with V3.1-Terminus.

What is DeepSeek Sparse Attention (DSA)?

DSA is a two-stage attention mechanism: a lightweight "lightning indexer" scores tokens across the context, then a fine-grained selector keeps only the most relevant tokens for full attention. This cuts the compute cost of long sequences with minimal impact on output quality.

Is DeepSeek-V3.2-Exp open source, and what license does it use?

Yes. The weights, technical report, and GPU kernels (TileLang and CUDA) are published on GitHub and Hugging Face under the MIT license, so it can be self-hosted and used commercially.

How much does DeepSeek-V3.2-Exp cost to use?

At launch DeepSeek cut API prices by more than 50%: roughly $0.28 per 1M input tokens (cache miss), about $0.028 per 1M cached input tokens, and $0.42 per 1M output tokens. Third-party hosts like OpenRouter list close equivalents (~$0.27 in / $0.41 out).