Overview
DeepSeek-Coder-V2 is an open-weight, code-specialized Mixture-of-Experts model from DeepSeek, released on June 17, 2024. It comes in two sizes: a flagship with 236B total parameters and 21B active, and a smaller DeepSeek-Coder-V2-Lite with 16B total and 2.4B active. Both ship as base and instruction-tuned checkpoints, and were further pre-trained from an intermediate DeepSeek-V2 checkpoint on an additional 6 trillion tokens of code and math data.
Compared with the original DeepSeek Coder, DeepSeek-Coder-V2 expands programming-language coverage from 86 to 338 languages and extends the context window from 16K to 128K tokens. DeepSeek positions it as the first open-source model to break the barrier of closed-source code intelligence: on coding and math benchmarks it reaches performance comparable to GPT-4-Turbo and reportedly edges out Claude 3 Opus and Gemini 1.5 Pro of that era.
On reported benchmarks the 236B Instruct model scores 90.2% on HumanEval, 76.2% on MBPP+, 43.4% on LiveCodeBench, 75.7% on MATH, and 94.9% on GSM8K, while keeping solid general-language ability at 79.2% MMLU. The weights are downloadable from Hugging Face under an MIT code license plus a DeepSeek Model License that permits commercial use; the hosted API endpoint was later merged into DeepSeek-V2.5 in September 2024.
| Released | 2024-06-17 |
|---|---|
| License | MIT (code) + DeepSeek Model License — commercial use permitted |
| Weights | Open weights |
| Parameters | 236B total · 21B active (Lite: 16B total · 2.4B active) |
| Context | 128K |
| Max output | Undisclosed |
| Architecture | Mixture-of-Experts (DeepSeekMoE), further pre-trained from a DeepSeek-V2 checkpoint on an extra 6 trillion tokens. |
| Knowledge cutoff | November 2023 |
| Modalities | Text |
| Status | Generally available |
Benchmarks
- HumanEval (DeepSeek-Coder-V2-Instruct)90.2%
- MBPP+ (EvalPlus)76.2%
- LiveCodeBench43.4%
- SWE-Bench12.7%
- Aider (code editing)73.7%
- MATH75.7%
- GSM8K94.9%
- MMLU79.2%
- Arena-Hard65%
Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.
Pricing
| Input | Free (open weights) self-hosted |
|---|---|
| Output | Free (open weights) self-hosted |
Weights are downloadable from Hugging Face for self-hosting; the original hosted deepseek-coder API endpoint was merged into DeepSeek-V2.5 in September 2024.
Strengths
- GPT-4-Turbo-class coding performance from a fully open-weight model (90.2% HumanEval)
- Broad language coverage — 338 programming languages, up from 86 in DeepSeek Coder v1
- 128K-token context for whole-repository and long-file reasoning
- Strong mathematical reasoning: 75.7% MATH and 94.9% GSM8K
- Two sizes — a 236B/21B-active flagship and a lightweight 16B/2.4B-active Lite — plus base and instruct checkpoints
- Permissive licensing (MIT code + commercial-use model license) for self-hosting
Best for
- Code generation and autocompletion across many languages
- Code repair and bug fixing
- Mathematical and algorithmic reasoning
- Repository-scale code understanding using the 128K context
- Self-hosted, on-prem coding assistants where open weights are required
- Cost-efficient open-weight alternative to closed coding APIs
How to access
| Provider | Model ID |
|---|---|
| Hugging Face (download weights) ↗ | deepseek-ai/DeepSeek-Coder-V2-Instruct |
| DeepSeek Platform (merged into DeepSeek-V2.5) ↗ | deepseek-coder |
DeepSeek Coder — every version
The full lineage of the DeepSeek Coder line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.
| Version | Released | Context | License |
|---|---|---|---|
| DeepSeek-Coder-V2current | 2024-06-17 | — | Open weights |
| DeepSeek Coder | 2023-11-02 | — | Open weights |
FAQ
What is DeepSeek-Coder-V2?
DeepSeek-Coder-V2 is an open-weight, code-specialized Mixture-of-Experts model released by DeepSeek on June 17, 2024. It comes as a 236B-total / 21B-active flagship and a 16B-total / 2.4B-active Lite version, each with base and instruction-tuned checkpoints, and was further pre-trained from a DeepSeek-V2 checkpoint on an extra 6 trillion tokens.
How well does DeepSeek-Coder-V2 perform on coding benchmarks?
The 236B Instruct model scores 90.2% on HumanEval, 76.2% on MBPP+, 43.4% on LiveCodeBench, and 73.7% on Aider, with 75.7% MATH and 94.9% GSM8K for math. DeepSeek reports performance comparable to GPT-4-Turbo and ahead of Claude 3 Opus and Gemini 1.5 Pro of that period.
How many languages and how much context does it support?
It supports 338 programming languages (up from 86 in DeepSeek Coder v1) and a 128K-token context window (extended from 16K).
Is DeepSeek-Coder-V2 free and open source?
Yes. The weights are downloadable from Hugging Face under an MIT license for code plus a DeepSeek Model License that permits commercial use. The original hosted deepseek-coder API endpoint was later merged into DeepSeek-V2.5 in September 2024.