Overview
Kimi K2.6 is Moonshot AI's flagship open-weight model in the Kimi K2 line, released on April 20, 2026. It is a 1-trillion-parameter Mixture-of-Experts (MoE) model that activates 32 billion parameters per token, extending the K2 family's focus on autonomous, long-horizon agentic coding. The weights are published on Hugging Face (moonshotai/Kimi-K2.6) under a Modified MIT license, and the model is served on Kimi.com, the Kimi app, the official Moonshot API, and the Kimi Code CLI.
Architecturally, Kimi K2.6 uses 384 experts (8 routed plus 1 shared per token) with Multi-head Latent Attention to keep the KV-cache footprint small, and folds in a 400M-parameter MoonViT encoder so the single model natively reads images and video alongside text. It supports a 256K-token (262,144) context window and ships in native INT4 quantization produced via Quantization-Aware Training, keeping the ~594GB checkpoint deployable on standard inference stacks like vLLM, SGLang, and KTransformers.
The release leans hard into agentic workflows: K2.6 offers Instant, Thinking, Agent, and Agent Swarm variants, with the swarm mode scaling to 300 domain-specialized sub-agents executing up to 4,000 coordinated steps in a single autonomous run. On coding and agentic benchmarks it trades blows with the leading proprietary frontier models of its day while remaining fully open-weight, which is the core of its appeal.
| Released | 2026-04-20 |
|---|---|
| License | Modified MIT (open weights; "Kimi K2" UI attribution required above 100M MAU or $20M monthly revenue) |
| Weights | Open weights |
| Parameters | 1T total (MoE), 32B active per token |
| Context | 256K |
| Max output | 256K |
| Architecture | Mixture-of-Experts transformer with 384 experts (8 routed + 1 shared per token) across 61 layers, Multi-head Latent Attention (MLA), SwiGLU activations, and a 400M-parameter MoonViT vision encoder for native image and video input. Ships in native INT4 quantization trained with Quantization-Aware Training (QAT). |
| Modalities | Text, Vision, Video |
| Status | Available |
Benchmarks
- SWE-Bench Verified80.2%
- SWE-Bench Pro58.6%
- Humanity's Last Exam (HLE-Full, with tools)54%
- LiveCodeBench v689.6%
- AIME 202696.4%
- GPQA-Diamond90.5%
- BrowseComp (Agent Swarm mode)86.3%
- MathVision (with Python)93.2%
Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.
Pricing
| Input | $0.95 / 1M tokens per 1M tokens |
|---|---|
| Cached input | $0.16 / 1M tokens per 1M tokens |
| Output | $4.00 / 1M tokens per 1M tokens |
Official Moonshot API pricing. Third-party providers (OpenRouter, DeepInfra, Fireworks) list their own rates.
Strengths
- State-of-the-art open-weight performance on agentic coding benchmarks (e.g. SWE-Bench Pro 58.6, SWE-Bench Verified 80.2)
- Fully open weights under a permissive Modified MIT license, self-hostable on vLLM/SGLang/KTransformers
- Native multimodal input (text, image, video) via the built-in MoonViT encoder
- Agent Swarm orchestration scaling to 300 sub-agents and 4,000 coordinated steps for long-horizon autonomous tasks
- 256K-token context window for large codebases and documents
- Aggressive pricing with an ~83% discount on cached input tokens
Best for
- Autonomous, multi-step software engineering and bug-fixing over large repositories
- Multi-agent orchestration and long-horizon research/coding runs
- Self-hosted deployment where open weights and data control matter
- Deep web research and agentic search workflows
- Coding-driven UI/UX generation from prompts and visual inputs
- Math and reasoning tasks with thinking mode enabled
How to access
| Provider | Model ID |
|---|---|
| Moonshot AI (Kimi) ↗ | kimi-k2.6 |
| OpenRouter ↗ | moonshotai/kimi-k2.6 |
| Cloudflare Workers AI ↗ | kimi-k2.6 |
| DeepInfra ↗ | moonshotai/Kimi-K2.6 |
Kimi K2 — every version
The full lineage of the Kimi K2 line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.
| Version | Released | Context | License |
|---|---|---|---|
| Kimi K2.7-Codecurrent | 2026-06-12 | 256K | Modified MIT |
| Kimi K2.6 | 2026-04-20 | — | Open weights |
| Kimi K2.5 | 2026-01-27 | — | Open weights |
| Kimi K2-Instruct-0905 | 2025-09-09 | — | Open weights |
| Kimi K2 | 2025-07-11 | — | MIT |
FAQ
Is Kimi K2.6 open source?
The weights are open and published on Hugging Face (moonshotai/Kimi-K2.6) under a Modified MIT license. Commercial use is permitted, but products above 100M monthly active users or $20M monthly revenue must display a "Kimi K2" attribution in their UI.
How big is Kimi K2.6 and how is it built?
It is a 1-trillion-parameter Mixture-of-Experts model that activates 32 billion parameters per token, using 384 experts (8 routed plus 1 shared) with Multi-head Latent Attention and a 400M-parameter MoonViT vision encoder. It supports a 256K-token context window and ships in native INT4 quantization.
What is Kimi K2.6 best at?
Agentic coding and long-horizon autonomous tasks. It posts strong open-weight scores like 80.2 on SWE-Bench Verified and 58.6 on SWE-Bench Pro, and its Agent Swarm mode coordinates up to 300 sub-agents across as many as 4,000 steps in a single run.
How much does Kimi K2.6 cost?
On the official Moonshot API, K2.6 is priced at $0.95 per million input tokens, $0.16 per million cached input tokens, and $4.00 per million output tokens. Third-party providers such as OpenRouter and DeepInfra publish their own rates.
