DeepSeek Coder

Name: DeepSeek Coder
Author: DeepSeek

DeepSeek's first open code-LLM family (1.3B–33B), trained from scratch on 2T tokens with a 16K window.

Overview

DeepSeek Coder was the first model family shipped by Chinese AI lab DeepSeek, released on 2 November 2023. It is a series of open code language models trained from scratch on roughly 2 trillion tokens (87% source code and 13% natural language in English and Chinese), spanning more than 80 programming languages. The line comes in four sizes — DeepSeek-Coder-1.3B, 5.7B, 6.7B, and 33B — each shipped in a Base (pre-trained) and an Instruct (instruction-tuned) variant.

Every model uses a 16K-token context window and was pre-trained at the repository level with a fill-in-the-middle (FIM) objective, so it handles project-level code completion and infilling rather than just isolated snippets. At launch, DeepSeek-Coder-Base-33B was the strongest open code model of its size, and the instruction-tuned DeepSeek-Coder-Instruct-33B was reported to outperform GPT-3.5-turbo on HumanEval and match it on MBPP.

DeepSeek Coder has since been superseded: DeepSeek-Coder-V2 (a much larger Mixture-of-Experts model with a far longer context) launched in June 2024, and DeepSeek lists the original line as discontinued. The original weights remain downloadable on Hugging Face under the DeepSeek Model License, which permits commercial use, so the models are still useful as small, self-hostable code assistants.

Released	2023-11-02
License	Code under MIT License; model weights under the DeepSeek Model License (permits commercial use with responsible-use restrictions).
Weights	Open weights
Parameters	Family: 1.3B, 5.7B (MoE), 6.7B, and 33B
Context	16K tokens
Max output	Not separately specified — bounded by the 16K context window
Architecture	Decoder-only transformer code LLM, pre-trained from scratch on ~2T tokens (87% source code, 13% English/Chinese natural language) at the repository level with a 16K window and a fill-in-the-blank (FIM) objective; Instruct variants fine-tuned on instruction data.
Knowledge cutoff	Not officially published
Modalities	text, code
Status	Superseded — the original DeepSeek Coder line (Nov 2023) was replaced by DeepSeek-Coder-V2 in June 2024 and is listed by DeepSeek as discontinued. The open weights remain available for download and self-hosting.

Benchmarks

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Strengths

Fully open weights with a license that allows commercial use, ideal for on-prem / air-gapped code assistants
Small sizes (1.3B, 6.7B) run on a single consumer GPU, with the 33B for higher quality
16K context plus repository-level + fill-in-the-middle training suits code completion and infilling, not just chat
Strong-for-its-era HumanEval/MBPP results; the 33B Instruct beat GPT-3.5-turbo on HumanEval at launch
Broad language coverage — 80+ programming languages, including multilingual HumanEval evaluation

Best for

Self-hosted code completion and infilling in an IDE or editor plugin
Local / offline code generation where data cannot leave the network
Fine-tuning a small, permissively-licensed base model for a domain-specific coding assistant
Repository-level autocomplete and FIM tasks using the 16K window
Research and benchmarking baselines for open code-LLM work

How to access

Provider	Model ID
Hugging Face (deepseek-ai) ↗	`deepseek-ai/deepseek-coder-33b-instruct`
Hugging Face (deepseek-ai) ↗	`deepseek-ai/deepseek-coder-6.7b-instruct`
Ollama ↗	`deepseek-coder`

DeepSeek Coder — every version

The full lineage of the DeepSeek Coder line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

Version	Released	Context	License
DeepSeek-Coder-V2current	2024-06-17	—	Open weights
DeepSeek Coder	2023-11-02	—	Open weights

FAQ

Is DeepSeek Coder open source?

The weights are open and free to download on Hugging Face. The code repository is MIT-licensed, and the model weights are under the DeepSeek Model License, which permits commercial use subject to responsible-use restrictions. It is best described as open-weight (source-available) rather than a fully OSI-approved license.

What sizes does DeepSeek Coder come in?

Four sizes — 1.3B, 5.7B, 6.7B, and 33B parameters — each available as a Base (pre-trained) model and an Instruct (instruction-tuned) model. All share a 16K-token context window.

Is DeepSeek Coder still the latest version?

No. The original DeepSeek Coder was released in November 2023 and was superseded by DeepSeek-Coder-V2 in June 2024, which uses a Mixture-of-Experts architecture and a much longer context. DeepSeek lists the original line as discontinued, but the weights remain downloadable.

How well does DeepSeek Coder do on coding benchmarks?

Per the DeepSeek-Coder paper, the 33B Instruct model scores about 79.3% Pass@1 on HumanEval (Python) and 70% on MBPP — at launch it outperformed GPT-3.5-turbo on HumanEval. The 33B Base scores 56.1% on HumanEval, with smaller sizes scaling down (49.4% for 6.7B, 34.8% for 1.3B).

// Overview

// Benchmarks

// Strengths

// Best for

// How to access

// DeepSeek Coder — every version

// FAQ