DeepSeek-V3.2-Speciale

Name: DeepSeek-V3.2-Speciale
Author: DeepSeek

DeepSeek's reasoning-maxed, open-weight olympiad model that rivals Gemini 3 Pro

Overview

DeepSeek-V3.2-Speciale is the reasoning-maxed sibling of DeepSeek-V3.2, both released by DeepSeek on December 1, 2025 as part of the DeepSeek V3 line. Where the standard V3.2 balances capability with efficiency and adds tool-use during thinking, Speciale is deliberately tuned for raw reasoning power: it is a "thinking-only" model that does not support tool calls and is designed for deep math, coding, and reasoning problems rather than agentic workflows.

Under the hood, DeepSeek-V3.2-Speciale is a Mixture-of-Experts model with 671 billion total parameters and roughly 37 billion active per token (HuggingFace lists 685B once the Multi-Token Prediction module is counted). It is built on DeepSeek Sparse Attention (DSA) combined with Multi-head Latent Attention, the same long-context architecture that lets the V3.2 family process a 128K-token window cheaply. Speciale was trained exclusively on reasoning data with a reduced length penalty during reinforcement learning, and it borrows the dataset and reward method from DeepSeekMath-V2 to push mathematical-proof ability.

The model is open weight under the MIT license and downloadable from HuggingFace. DeepSeek positioned Speciale as a research and community-evaluation release: at launch it was served through a temporary API endpoint that expired on December 15, 2025, at the same pricing as V3.2. DeepSeek reports that V3.2-Speciale rivals Gemini-3.0-Pro on hard reasoning benchmarks and earned gold-medal-level results at the 2025 IMO, CMO, ICPC World Finals, and IOI.

Released	2025-12-01
License	MIT
Weights	Open weights
Parameters	671B total / 37B active (MoE)
Context	128K
Max output	128K
Architecture	Mixture-of-Experts (671B total, ~37B active per token) built on DeepSeek Sparse Attention (DSA) with Multi-head Latent Attention (MLA). Speciale is a high-compute "thinking-only" variant of DeepSeek-V3.2, trained exclusively on reasoning data with a reduced length penalty during reinforcement learning and incorporating the dataset and reward method from DeepSeekMath-V2 to strengthen mathematical proofs. HuggingFace lists the checkpoint as 685B parameters when counting the Multi-Token Prediction (MTP) module.
Knowledge cutoff	Not disclosed
Modalities	Text
Status	available

Benchmarks

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input	$0.28 / 1M tokens (cache miss) per 1M tokens
Cached input	$0.028 / 1M tokens (cache hit) per 1M tokens
Output	$0.42 / 1M tokens per 1M tokens

DeepSeek-V3.2-Speciale was served at the same pricing as DeepSeek-V3.2 via a temporary API endpoint that expired December 15, 2025. Weights remain available under MIT for self-hosting.

Pricing source ↗

Strengths

Gold-medal-level performance on the 2025 IMO, CMO, ICPC World Finals, and IOI competitions
Top-tier scores on hard math and coding benchmarks (96.0% AIME 2025, 99.2% HMMT Feb 2025, 88.7% LiveCodeBench)
Open weights under a permissive MIT license, fully downloadable and self-hostable
DeepSeek Sparse Attention (DSA) delivers efficient 128K long-context reasoning
MoE design activates only ~37B of 671B parameters per token, keeping inference cost low for the capability level
Released at the same low price as V3.2 (far cheaper than comparable frontier reasoning models)

Best for

Solving competition-grade mathematics and writing formal mathematical proofs
Hard algorithmic and competitive-programming problems (ICPC/IOI-style)
Research and academic evaluation of frontier open-weight reasoning models
Long-context reasoning over large documents and codebases (up to 128K tokens)
Self-hosted reasoning deployments where open weights and data control matter
Benchmarking and distillation source for smaller reasoning models

How to access

Provider	Model ID
DeepSeek ↗	`deepseek-v3.2-speciale`
HuggingFace ↗	`deepseek-ai/DeepSeek-V3.2-Speciale`
OpenRouter ↗	`deepseek/deepseek-v3.2-speciale`

DeepSeek V3 — every version

The full lineage of the DeepSeek V3 line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

Version	Released	Context	License
DeepSeek-V3.2current	2025-12-01	—	Open weights
DeepSeek-V3.2-Speciale	2025-12-01	—	Open weights
DeepSeek-V3.2-Exp	2025-09-29	—	Open weights
DeepSeek-V3.1-Terminus	2025-09-22	—	Open weights
DeepSeek-V3.1	2025-08-21	—	Open weights
DeepSeek-V3-0324	2025-03-24	—	Open weights
DeepSeek-V3	2024-12-26	—	Open weights
DeepSeek-V2.5	2024-09-05	—	Open weights
DeepSeek-V2	2024-05	—	Open weights

FAQ

What is DeepSeek-V3.2-Speciale?

DeepSeek-V3.2-Speciale is a reasoning-maxed, open-weight variant of DeepSeek-V3.2, released by DeepSeek on December 1, 2025. It is a 671B-parameter Mixture-of-Experts model (about 37B active per token) built on DeepSeek Sparse Attention. It is a "thinking-only" model with no tool-call support, tuned for deep math, coding, and reasoning tasks.

How does Speciale differ from the standard DeepSeek-V3.2?

Standard V3.2 balances capability and efficiency and integrates tool-use directly into its thinking mode. Speciale is a higher-compute variant trained exclusively on reasoning data with a reduced length penalty during reinforcement learning. It produces longer reasoning chains, scores higher on hard benchmarks, does not support tool calls, and was released as a temporary research endpoint.

Is DeepSeek-V3.2-Speciale open source, and what does it cost?

The model weights are open and released under the permissive MIT license, downloadable from HuggingFace for self-hosting. On DeepSeek's API it was served at the same price as V3.2: about $0.28 per million input tokens (cache miss), $0.028 per million on a cache hit, and $0.42 per million output tokens, through a temporary endpoint that expired December 15, 2025.

How does DeepSeek-V3.2-Speciale perform on benchmarks?

DeepSeek reports it rivals Gemini-3.0-Pro on difficult reasoning workloads. Per the tech report it scores 96.0% on AIME 2025, 99.2% on HMMT Feb 2025, 88.7% on LiveCodeBench, and 85.7% on GPQA Diamond, and it achieved gold-medal-level results at the 2025 IMO, CMO, ICPC World Finals, and IOI.

// Overview

// Benchmarks

// Pricing

// Strengths

// Best for

// How to access

// DeepSeek V3 — every version

// FAQ