AI/TLDR

Kimi-Dev-72B

Moonshot AI's open-weight 72B coding model for fixing real software bugs

Overview

Kimi-Dev-72B is an open-weight, 72-billion-parameter large language model from Moonshot AI (the team behind Kimi), released on June 16, 2025 and purpose-built for software engineering. Rather than a general chat assistant, Kimi-Dev-72B is tuned for the concrete loop developers live in: reading a bug report or GitHub issue, finding the file that needs to change, writing the fix, and producing unit tests that prove it works.

The model starts from the Qwen2.5-72B base and is shaped in two stages. First, a roughly 150-billion-token mid-training phase exposes it to real GitHub issues and pull-request commits so it learns how human engineers reason about defects. Then large-scale reinforcement learning lets Kimi-Dev-72B autonomously patch real repositories inside Docker containers, earning a reward only when the entire test suite passes — a strict, execution-grounded signal that pushes it toward fixes that actually run.

At launch, Kimi-Dev-72B scored 60.4% on SWE-bench Verified, which Moonshot reported as a new state of the art among open-source models. It is released under the permissive MIT license with full weights on Hugging Face, supports a 131K-token context window, and can be self-hosted through vLLM, SGLang, or Hugging Face Transformers, making it practical for teams that want a capable coding model they fully control.

Released2025-06
LicenseMIT
WeightsOpen weights
Parameters72B
Context131K
ArchitectureDense transformer fine-tuned from Qwen2.5-72B; ~150B-token mid-training on GitHub issues and PR commits, then large-scale reinforcement learning that patches real repos in Docker and rewards only when the full test suite passes.
Knowledge cutoffNot disclosed
ModalitiesText
StatusAvailable

Benchmarks

  1. SWE-bench Verified60.4%

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input$0.29 per 1M tokens
Output$1.15 per 1M tokens

Kimi-Dev-72B is open-weight and not sold on Moonshot's first-party API; these are third-party hosting rates on OpenRouter, which also lists a free tier (moonshotai/kimi-dev-72b:free). Self-hosting cost is your own compute.

Pricing source ↗

Strengths

  • State-of-the-art open-source result on SWE-bench Verified (60.4%) at its June 2025 release
  • Permissive MIT license with full open weights — unrestricted commercial and private use
  • Trained with execution-grounded RL: rewarded only when a patch passes the full Docker test suite, favoring fixes that genuinely run
  • 131K-token context handles large files and multi-file repository tasks
  • Self-hostable via vLLM, SGLang, and Transformers, with broad community quantizations (GGUF, GPTQ, MLX) for local use

Best for

  • Automated bug fixing and GitHub issue resolution against real repositories
  • Generating and repairing unit tests for changed code
  • Fault localization — pinpointing which file or function needs editing
  • Code review and patch suggestion inside engineering workflows
  • Self-hosted, license-clean coding assistant for teams that need on-prem control

How to access

ProviderModel ID
OpenRouter ↗moonshotai/kimi-dev-72b

FAQ

What is Kimi-Dev-72B?

Kimi-Dev-72B is an open-weight, 72-billion-parameter coding model from Moonshot AI, released June 16, 2025. It is built on Qwen2.5-72B and specialized for software engineering tasks like bug fixing, GitHub issue resolution, and unit-test generation.

How well does Kimi-Dev-72B perform on SWE-bench?

It scores 60.4% on SWE-bench Verified, which Moonshot AI reported as a state-of-the-art result among open-source models at its release.

Is Kimi-Dev-72B open source and free to use?

Yes. The full weights are published on Hugging Face under the permissive MIT license, allowing commercial and private use. You can self-host it, and hosted access is available on OpenRouter, which also offers a free tier.

What context length does Kimi-Dev-72B support?

It supports a 131K-token (128K) context window, enough for large source files and multi-file repository tasks. The model is text-only and can be served with vLLM, SGLang, or Hugging Face Transformers.