Kimi-Dev-72B

Moonshot AI's open-weight 72B coding model for fixing real software bugs

Overview

Kimi-Dev-72B is an open-weight, 72-billion-parameter large language model from Moonshot AI (the team behind Kimi), released on June 16, 2025 and purpose-built for software engineering. Rather than a general chat assistant, Kimi-Dev-72B is tuned for the concrete loop developers live in: reading a bug report or GitHub issue, finding the file that needs to change, writing the fix, and producing unit tests that prove it works.

The model starts from the Qwen2.5-72B base and is shaped in two stages. First, a roughly 150-billion-token mid-training phase exposes it to real GitHub issues and pull-request commits so it learns how human engineers reason about defects. Then large-scale reinforcement learning lets Kimi-Dev-72B autonomously patch real repositories inside Docker containers, earning a reward only when the entire test suite passes — a strict, execution-grounded signal that pushes it toward fixes that actually run.

At launch, Kimi-Dev-72B scored 60.4% on SWE-bench Verified, which Moonshot reported as a new state of the art among open-source models. It is released under the permissive MIT license with full weights on Hugging Face, supports a 131K-token context window, and can be self-hosted through vLLM, SGLang, or Hugging Face Transformers, making it practical for teams that want a capable coding model they fully control.

Released	2025-06
License	MIT
Weights	Open weights
Parameters	72B
Context	131K
Architecture	Dense transformer fine-tuned from Qwen2.5-72B; ~150B-token mid-training on GitHub issues and PR commits, then large-scale reinforcement learning that patches real repos in Docker and rewards only when the full test suite passes.
Knowledge cutoff	Not disclosed
Modalities	Text
Status	Available

Benchmarks

SWE-bench Verified60.4%

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input	$0.29 per 1M tokens
Output	$1.15 per 1M tokens

Kimi-Dev-72B is open-weight and not sold on Moonshot's first-party API; these are third-party hosting rates on OpenRouter, which also lists a free tier (moonshotai/kimi-dev-72b:free). Self-hosting cost is your own compute.

Pricing source ↗

Strengths

State-of-the-art open-source result on SWE-bench Verified (60.4%) at its June 2025 release
Permissive MIT license with full open weights — unrestricted commercial and private use
Trained with execution-grounded RL: rewarded only when a patch passes the full Docker test suite, favoring fixes that genuinely run
131K-token context handles large files and multi-file repository tasks
Self-hostable via vLLM, SGLang, and Transformers, with broad community quantizations (GGUF, GPTQ, MLX) for local use

Best for

Automated bug fixing and GitHub issue resolution against real repositories
Generating and repairing unit tests for changed code
Fault localization — pinpointing which file or function needs editing
Code review and patch suggestion inside engineering workflows
Self-hosted, license-clean coding assistant for teams that need on-prem control

How to access

Provider	Model ID
OpenRouter ↗	`moonshotai/kimi-dev-72b`

FAQ

What is Kimi-Dev-72B?

Kimi-Dev-72B is an open-weight, 72-billion-parameter coding model from Moonshot AI, released June 16, 2025. It is built on Qwen2.5-72B and specialized for software engineering tasks like bug fixing, GitHub issue resolution, and unit-test generation.

How well does Kimi-Dev-72B perform on SWE-bench?

It scores 60.4% on SWE-bench Verified, which Moonshot AI reported as a state-of-the-art result among open-source models at its release.

Is Kimi-Dev-72B open source and free to use?

Yes. The full weights are published on Hugging Face under the permissive MIT license, allowing commercial and private use. You can self-host it, and hosted access is available on OpenRouter, which also offers a free tier.

What context length does Kimi-Dev-72B support?

It supports a 131K-token (128K) context window, enough for large source files and multi-file repository tasks. The model is text-only and can be served with vLLM, SGLang, or Hugging Face Transformers.

// Overview

// Benchmarks

// Pricing

// Strengths

// Best for

// How to access

// FAQ