AI/TLDR

DeepSeek-V3.2-Speciale

DeepSeek's reasoning-maxed, open-weight olympiad model that rivals Gemini 3 Pro

Overview

DeepSeek-V3.2-Speciale is the reasoning-maxed sibling of DeepSeek-V3.2, both released by DeepSeek on December 1, 2025 as part of the DeepSeek V3 line. Where the standard V3.2 balances capability with efficiency and adds tool-use during thinking, Speciale is deliberately tuned for raw reasoning power: it is a "thinking-only" model that does not support tool calls and is designed for deep math, coding, and reasoning problems rather than agentic workflows.

Under the hood, DeepSeek-V3.2-Speciale is a Mixture-of-Experts model with 671 billion total parameters and roughly 37 billion active per token (HuggingFace lists 685B once the Multi-Token Prediction module is counted). It is built on DeepSeek Sparse Attention (DSA) combined with Multi-head Latent Attention, the same long-context architecture that lets the V3.2 family process a 128K-token window cheaply. Speciale was trained exclusively on reasoning data with a reduced length penalty during reinforcement learning, and it borrows the dataset and reward method from DeepSeekMath-V2 to push mathematical-proof ability.

The model is open weight under the MIT license and downloadable from HuggingFace. DeepSeek positioned Speciale as a research and community-evaluation release: at launch it was served through a temporary API endpoint that expired on December 15, 2025, at the same pricing as V3.2. DeepSeek reports that V3.2-Speciale rivals Gemini-3.0-Pro on hard reasoning benchmarks and earned gold-medal-level results at the 2025 IMO, CMO, ICPC World Finals, and IOI.

Released2025-12-01
LicenseMIT
WeightsOpen weights
Parameters671B total / 37B active (MoE)
Context128K
Max output128K
ArchitectureMixture-of-Experts (671B total, ~37B active per token) built on DeepSeek Sparse Attention (DSA) with Multi-head Latent Attention (MLA). Speciale is a high-compute "thinking-only" variant of DeepSeek-V3.2, trained exclusively on reasoning data with a reduced length penalty during reinforcement learning and incorporating the dataset and reward method from DeepSeekMath-V2 to strengthen mathematical proofs. HuggingFace lists the checkpoint as 685B parameters when counting the Multi-Token Prediction (MTP) module.
Knowledge cutoffNot disclosed
ModalitiesText
Statusavailable

Benchmarks

  1. AIME 2025 (Pass@1)96%
  2. HMMT Feb 2025 (Pass@1)99.2%
  3. HMMT Nov 2025 (Pass@1)94.4%
  4. IMOAnswerBench (Pass@1)84.5%
  5. LiveCodeBench (Pass@1-COT)88.7%
  6. GPQA Diamond (Pass@1)85.7%
  7. Humanity's Last Exam (HLE, Pass@1)30.6%

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input$0.28 / 1M tokens (cache miss) per 1M tokens
Cached input$0.028 / 1M tokens (cache hit) per 1M tokens
Output$0.42 / 1M tokens per 1M tokens

DeepSeek-V3.2-Speciale was served at the same pricing as DeepSeek-V3.2 via a temporary API endpoint that expired December 15, 2025. Weights remain available under MIT for self-hosting.

Pricing source ↗

Strengths

  • Gold-medal-level performance on the 2025 IMO, CMO, ICPC World Finals, and IOI competitions
  • Top-tier scores on hard math and coding benchmarks (96.0% AIME 2025, 99.2% HMMT Feb 2025, 88.7% LiveCodeBench)
  • Open weights under a permissive MIT license, fully downloadable and self-hostable
  • DeepSeek Sparse Attention (DSA) delivers efficient 128K long-context reasoning
  • MoE design activates only ~37B of 671B parameters per token, keeping inference cost low for the capability level
  • Released at the same low price as V3.2 (far cheaper than comparable frontier reasoning models)

Best for

  • Solving competition-grade mathematics and writing formal mathematical proofs
  • Hard algorithmic and competitive-programming problems (ICPC/IOI-style)
  • Research and academic evaluation of frontier open-weight reasoning models
  • Long-context reasoning over large documents and codebases (up to 128K tokens)
  • Self-hosted reasoning deployments where open weights and data control matter
  • Benchmarking and distillation source for smaller reasoning models

How to access

ProviderModel ID
DeepSeek ↗deepseek-v3.2-speciale
HuggingFace ↗deepseek-ai/DeepSeek-V3.2-Speciale
OpenRouter ↗deepseek/deepseek-v3.2-speciale

DeepSeek V3 — every version

The full lineage of the DeepSeek V3 line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

VersionReleasedContextLicense
DeepSeek-V3.2current2025-12-01Open weights
DeepSeek-V3.2-Speciale2025-12-01Open weights
DeepSeek-V3.2-Exp2025-09-29Open weights
DeepSeek-V3.1-Terminus2025-09-22Open weights
DeepSeek-V3.12025-08-21Open weights
DeepSeek-V3-03242025-03-24Open weights
DeepSeek-V32024-12-26Open weights
DeepSeek-V2.52024-09-05Open weights
DeepSeek-V22024-05Open weights

FAQ

What is DeepSeek-V3.2-Speciale?

DeepSeek-V3.2-Speciale is a reasoning-maxed, open-weight variant of DeepSeek-V3.2, released by DeepSeek on December 1, 2025. It is a 671B-parameter Mixture-of-Experts model (about 37B active per token) built on DeepSeek Sparse Attention. It is a "thinking-only" model with no tool-call support, tuned for deep math, coding, and reasoning tasks.

How does Speciale differ from the standard DeepSeek-V3.2?

Standard V3.2 balances capability and efficiency and integrates tool-use directly into its thinking mode. Speciale is a higher-compute variant trained exclusively on reasoning data with a reduced length penalty during reinforcement learning. It produces longer reasoning chains, scores higher on hard benchmarks, does not support tool calls, and was released as a temporary research endpoint.

Is DeepSeek-V3.2-Speciale open source, and what does it cost?

The model weights are open and released under the permissive MIT license, downloadable from HuggingFace for self-hosting. On DeepSeek's API it was served at the same price as V3.2: about $0.28 per million input tokens (cache miss), $0.028 per million on a cache hit, and $0.42 per million output tokens, through a temporary endpoint that expired December 15, 2025.

How does DeepSeek-V3.2-Speciale perform on benchmarks?

DeepSeek reports it rivals Gemini-3.0-Pro on difficult reasoning workloads. Per the tech report it scores 96.0% on AIME 2025, 99.2% on HMMT Feb 2025, 88.7% on LiveCodeBench, and 85.7% on GPQA Diamond, and it achieved gold-medal-level results at the 2025 IMO, CMO, ICPC World Finals, and IOI.