DeepSeek · 2026-06-26 · major

DSpark + DeepSpec — DeepSeek opens its speculative decoding stack

DeepSeek released DeepSpec, an MIT-licensed codebase to train and evaluate draft models for speculative decoding, plus DSpark speculative-decoding modules attached to its V4-Pro and V4-Flash checkpoints on Hugging Face.

DeepSeek shipped a free codebase for training speculative-decoding drafters, plus DSpark drafters bolted onto V4-Pro and V4-Flash.

Quick facts

Maker	DeepSeek
License	MIT
Codebase	github.com/deepseek-ai/DeepSpec
Draft algorithms	DSpark, DFlash, Eagle3
Target families	Qwen3, Gemma (in repo); DeepSeek V4 Pro/Flash (on HF)
Default training rig	1 node × 8 GPUs
Default cache footprint	~38 TB for Qwen3-4B target

What is it?

DeepSpec is a full-stack codebase from DeepSeek for training and evaluating draft models used in speculative decoding. The same release adds DSpark drafters as separate Hugging Face uploads (DeepSeek-V4-Pro-DSpark and DeepSeek-V4-Flash-DSpark) that attach to existing V4 checkpoints instead of replacing them. Everything ships MIT-licensed.

How does it work?

Speculative decoding pairs a fast draft model with the real target model: the drafter proposes several future tokens at once, then the target verifies them in a single forward pass, so accepted tokens are committed together instead of being generated one by one. DeepSpec packages the pipeline end to end: it downloads prompts, regenerates target outputs into a cached corpus, trains the drafter against that cache, then evaluates acceptance on GSM8K, MATH-500, AIME25, HumanEval, MBPP, LiveCodeBench, MT-Bench, Alpaca and Arena-Hard-v2. The included drafter recipes cover three algorithms (DSpark, DFlash and Eagle3), with example configs for Qwen3 and Gemma targets.
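The draft-then-verify loop described above can be sketched in a few lines. This is a generic, greedy-decoding illustration and not DSpark's actual algorithm or DeepSpec's API: the function names, the toy models, and the draft length `k` are all assumptions made for the example. Sampling-based variants replace the exact-match check with a rejection-sampling rule.

```python
from typing import Callable, List, Sequence, Tuple

Token = int
# Hypothetical stand-in for a model: maps a token prefix to its greedy next token.
NextToken = Callable[[Sequence[Token]], Token]


def speculative_step(prefix: List[Token], draft: NextToken,
                     target: NextToken, k: int = 4) -> List[Token]:
    """Return the tokens committed by one draft-then-verify round."""
    # 1. The drafter proposes k tokens autoregressively (the cheap part).
    proposed: List[Token] = []
    for _ in range(k):
        proposed.append(draft(prefix + proposed))

    # 2. The target checks every proposed position. A real engine gets all
    #    of these predictions from one batched forward pass; this loop only
    #    reproduces the result of that pass.
    committed: List[Token] = []
    for i in range(k):
        expected = target(prefix + proposed[:i])
        if proposed[i] != expected:
            # First mismatch: keep the target's own token and stop.
            committed.append(expected)
            return committed
        committed.append(proposed[i])

    # 3. Every draft was accepted, so the same pass also yields a bonus token.
    committed.append(target(prefix + proposed))
    return committed


def generate(prefix: List[Token], draft: NextToken, target: NextToken,
             max_new: int, k: int = 4) -> Tuple[List[Token], int]:
    """Decode max_new tokens; also return how many verify rounds were needed."""
    out = list(prefix)
    rounds = 0
    while len(out) - len(prefix) < max_new:
        out.extend(speculative_step(out, draft, target, k))
        rounds += 1
    return out[:len(prefix) + max_new], rounds


# Toy models for illustration only: the target counts modulo 10, and the
# drafter agrees with it except right after the token 4.
def toy_target(p: Sequence[Token]) -> Token:
    return (p[-1] + 1) % 10


def toy_draft(p: Sequence[Token]) -> Token:
    return 0 if p[-1] == 4 else (p[-1] + 1) % 10
```

Because every committed token is either confirmed or supplied by the target, the output matches what the target would have produced on its own under greedy decoding; the drafter only changes how many target passes are needed. A better drafter means longer accepted runs and fewer rounds, which is why acceptance is the metric the evaluation step reports.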

Why does it matter?

Inference is where the bill is paid, and speculative decoding is one of the cheaper ways to speed it up without changing the model's outputs, but the draft model is the hard part to train. By open-sourcing both the training stack and the DSpark drafters for its frontier V4 checkpoints, DeepSeek lets self-hosters and providers raise tokens per second, and so lower per-token serving cost, without changing the underlying model. It also brings another performance lever previously held by closed labs into the open ecosystem.

Who is it for?

Inference providers, self-hosters, and research teams running open-weight models.

Frequently asked questions

Is DSpark a new DeepSeek model?

No. DeepSeek-V4-Pro-DSpark and DeepSeek-V4-Flash-DSpark are the same V4-Pro and V4-Flash checkpoints with an additional speculative-decoding drafter attached. Quality is unchanged; the drafter just lets inference produce multiple verified tokens per target forward pass.

What's the difference between DeepSpec and DSpark?

DeepSpec is the training codebase: the pipeline for downloading prompts, caching target outputs, training a drafter, and evaluating acceptance. DSpark is one of the three drafter algorithms it ships with (the others are DFlash and Eagle3) and is the algorithm DeepSeek uses for its own V4 drafters.

Which models can I train a DSpark drafter for?

DeepSpec ships example configurations for the Qwen3 and Gemma target families, and DeepSeek released DSpark drafters for its own V4-Pro and V4-Flash checkpoints on Hugging Face. The training pipeline is generic, so other open-weight targets can be added by writing a config.

What does it cost to train one of these drafters?

DeepSpec's default training configuration assumes a single node with 8 GPUs. The data-prep step alone can build a target-output cache of about 38 TB for the example Qwen3-4B run. The repo notes you can scale CUDA settings down for smaller rigs, but disk is the limiting factor at default settings.

What's the license, and can I use it commercially?

DeepSpec, the DSpark drafters, and the underlying DeepSeek V4-Pro and V4-Flash checkpoints are all released under the MIT License, so commercial use is permitted. The MIT terms apply to the code in the repository and the weights on Hugging Face.
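Since disk is the limiting factor at default settings, a quick preflight check against the roughly 38 TB cache figure quoted above can save a failed data-prep run. The sketch below is not part of DeepSpec: the function names, the decimal-terabyte interpretation, and the 10% headroom are assumptions chosen for illustration.

```python
import shutil

CACHE_TB = 38  # approximate default cache size quoted for the Qwen3-4B example


def has_room(free_bytes: int, required_tb: float = CACHE_TB,
             headroom: float = 1.1) -> bool:
    """True if free_bytes covers required_tb (decimal TB) plus headroom."""
    required_bytes = required_tb * 10**12 * headroom
    return free_bytes >= required_bytes


def check_path(path: str = ".") -> bool:
    """Check the filesystem holding `path` (hypothetical cache location)."""
    return has_room(shutil.disk_usage(path).free)
```

Calling `check_path` on the directory you intend to use for the cache returns `False` on most single-disk machines, which is a hint to scale the configuration down before starting.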

Try it

git clone https://github.com/deepseek-ai/DeepSpec