Question 1

What is Kimi Linear (48B-A3B)?

Accepted Answer

It is an open-weight large language model from Moonshot AI, released on October 31, 2025. It uses a hybrid linear-attention architecture and is a Mixture-of-Experts model with 48B total parameters but only about 3B active per token. It supports a 1M-token context window and ships under the MIT license.

Question 2

What makes Kimi Linear's architecture different?

Accepted Answer

Instead of full attention in every layer, it interleaves three layers of Kimi Delta Attention (KDA, a refined Gated DeltaNet) for every one Multi-Head Latent Attention (MLA) layer. Moonshot reports this 3:1 hybrid cuts KV-cache memory by up to 75% and gives up to 6x faster decoding at a 1M-token context, while matching or beating full attention on quality.

Question 3

Is Kimi Linear (48B-A3B) free and open source?

Accepted Answer

Yes. Both the Base and Instruct checkpoints are released on Hugging Face under the MIT license, so you can download, run, and fine-tune them yourself. The KDA kernels are also open-sourced with vLLM and Flash Linear Attention (FLA) support.

Question 4

How can I run Kimi Linear (48B-A3B)?

Accepted Answer

You can self-host the open weights from Hugging Face (vLLM with the FLA kernel is the recommended path), or use a third-party serverless provider such as Featherless AI. Moonshot's own hosted Kimi API currently lists its K2 models rather than Kimi Linear, and there is no published first-party per-token price for this model.

Released	2025-10-31
License	MIT
Weights	Open weights
Parameters	48B total, 3B active (MoE)
Context	1M
Architecture	Hybrid linear-attention Mixture-of-Experts. Stacks Kimi Delta Attention (KDA) — a refined Gated DeltaNet with fine-grained channel-wise gating — and full Multi-Head Latent Attention (MLA) layers in a 3:1 ratio (3 KDA layers per MLA layer). 48B total parameters with ~3B activated per token. Trained on 5.7T tokens.
Knowledge cutoff	Not disclosed
Modalities	Text
Status	Available

Provider	Model ID
Hugging Face (self-host / open weights) ↗	`moonshotai/Kimi-Linear-48B-A3B-Instruct`
Featherless AI (serverless) ↗	`moonshotai/Kimi-Linear-48B-A3B-Instruct`

Kimi Linear (48B-A3B)

Overview

Benchmarks

Strengths

Best for

How to access

FAQ

// Overview

// Benchmarks

// Strengths

// Best for

// How to access

// FAQ

Overview

Benchmarks

Strengths

Best for

How to access

FAQ