DeepSeek · 2026-04-24 · seismic

DeepSeek V4 — 1.6T Open-Weights MoE Tops LiveCodeBench with MIT License

Item: DeepSeek V4 — 1.6T Open-Weights MoE Tops LiveCodeBench with MIT License
Rating: 5
Author: AI/TLDR

DeepSeek releases V4 today: two MIT-licensed open-weight variants — V4-Pro (1.6T total, 49B active) and V4-Flash (284B total, 13B active), both with 1M-token context. V4-Pro scores 93.5 on LiveCodeBench and Codeforces 3206 in Think Max mode, beating Gemini 3.1 Pro and Claude Opus 4.6 on coding benchmarks.

DeepSeek V4-Pro benchmark performance chart comparing to Claude Opus 4.6 and Gemini 3.1 Pro on coding and reasoning tasks

DeepSeek's new open-weights flagship: two MIT-licensed MoE models with 1M-token context and top-tier coding performance released today.

Key specs

License	MIT
Context window	1M tokens
V4 pro total params	1.6T
V4 pro active params per token	49B
V4 flash total params	284B
V4 flash active params per token	13B
Live code bench (v4 pro think max)	93.5
Codeforces rating (v4 pro think max)	3,206
Imoanswer bench (v4 pro think max)	89.8
Gpqa diamond (v4 pro think max)	90.1
V4 flash api input (cache miss)	$0.14 / MTok
V4 flash api output	$0.28 / MTok
V4 pro api input (cache miss)	$1.74 / MTok
V4 pro api output	$3.48 / MTok
Hn points	468+

What is it?

DeepSeek V4 is an open-weights language model family released April 24, 2026, with two Mixture-of-Experts variants: V4-Pro (1.6T total parameters, 49B activated per token) and V4-Flash (284B total, 13B activated). Both support 1 million tokens of context. Weights are on HuggingFace under MIT; API access is live at api-docs.deepseek.com with model names deepseek-v4-pro and deepseek-v4-flash. Old deepseek-chat and deepseek-reasoner models are deprecated July 24, 2026.

How does it work?

V4 introduces three architectural improvements over V3.2: Hybrid Attention (Compressed Sparse Attention + Heavily Compressed Attention) cuts single-token inference FLOPs to 27% of V3.2's and KV cache to 10%; Manifold-Constrained Hyper-Connections improve residual stream expressiveness; and a Muon Optimizer speeds convergence. The model trained on 32T+ tokens supports three reasoning modes: Non-Think (fastest), Think High, and Think Max (highest quality).

Why does it matter?

V4-Pro's Think Max mode scores 93.5 on LiveCodeBench and Codeforces 3206, ahead of Gemini 3.1 Pro High (91.7, 3052) and Claude Opus 4.6 Max (88.8) on the same benchmarks. The MIT license removes usage restrictions that limit most frontier models, and V4-Flash pricing ($0.14/MTok input, $0.28/MTok output) undercuts most frontier APIs. For teams running coding agents, this is the strongest open-weights option released to date.

Who is it for?

ML engineers building coding agents or needing open-weights frontier-quality reasoning; API users looking for competitive pricing

Try it

huggingface.co/deepseek-ai/DeepSeek-V4-Pro — or via API: model='deepseek-v4-pro' at api-docs.deepseek.com