Kimi K1.5

Moonshot AI's RL-trained multimodal reasoning model, reported by Moonshot to match OpenAI o1 on math, code, and vision benchmarks.

Overview

Kimi K1.5 is a reasoning-focused large language model from Moonshot AI, the company behind the Kimi assistant. Released on 2025-01-20, it was Moonshot's first model built around large-scale reinforcement learning and is multimodal: it is jointly trained on text and vision data so it can reason over images such as charts and diagrams as well as plain text. Moonshot reported that Kimi K1.5 matched OpenAI's o1 across mathematics, coding, and multimodal reasoning.

The model is described in the technical report "Kimi k1.5: Scaling Reinforcement Learning with LLMs" (arXiv:2501.12599) by the Kimi Team. A central theme is scaling: the team scales the RL context window to 128k tokens and shows reasoning keeps improving as context grows, using a deliberately simple RL recipe that avoids the more complex machinery (tree search, value functions, process reward models) common in other reasoning systems.
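To make "no value function" concrete, here is a minimal Python sketch of one common critic-free approach: sample several responses to the same prompt, score each with an outcome reward, and use the mean reward of the group as the baseline. This is an illustrative assumption, not necessarily the exact algorithm in the report; the helper name `mean_baseline_advantages` and the 0/1 rewards are hypothetical.

```python
from statistics import mean


def mean_baseline_advantages(rewards):
    """Return reward minus the group mean for each sampled response.

    Hypothetical helper: the mean of the sampled rewards stands in for
    a learned value function as the baseline.
    """
    if not rewards:
        raise ValueError("need at least one sampled reward")
    baseline = mean(rewards)
    return [r - baseline for r in rewards]


# Four sampled answers to one prompt; 1.0 = final answer correct, 0.0 = wrong.
advantages = mean_baseline_advantages([1.0, 0.0, 0.0, 1.0])
print(advantages)
```

Responses that beat the group average get a positive weight and the rest a negative one, so no separate critic network or step-level reward model is needed to produce a learning signal.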

Kimi K1.5 comes in two flavours. A long chain-of-thought mode targets the hardest problems and reaches o1-level scores on math and competitive coding. A short chain-of-thought mode distills those long-reasoning techniques into a faster, more compact model that Moonshot reports beats GPT-4o and Claude 3.5 Sonnet on several reasoning benchmarks. Unlike Moonshot's later K2 line, Kimi K1.5 was not released as open weights; it was offered through the Kimi product rather than as a downloadable model.

Released	2025-01-20
License	Proprietary
Weights	API only
Parameters	Not disclosed
Context	128K
Architecture	A multimodal large language model trained jointly on text and vision data, then improved with large-scale reinforcement learning. The Kimi Team describes a deliberately simple RL framework that reaches strong reasoning without relying on Monte Carlo tree search, value functions, or separate process reward models. The RL context window is scaled to 128k tokens, and the team reports that performance keeps improving as the context grows. The model ships in two inference styles: a long chain-of-thought ("long-CoT") mode for hard reasoning and a short chain-of-thought ("short-CoT") mode that applies long-CoT techniques to strengthen a faster, more compact model.
Knowledge cutoff	Not disclosed
Modalities	Text, Vision
Status	Available

Benchmarks

[Benchmark chart: scores on a 0–100 scale; higher is better.]

Strengths

Strong mathematical reasoning: a reported 77.5 on AIME 2024 and 96.2 on MATH-500 in long chain-of-thought mode
Competitive-programming ability at roughly the 94th percentile on Codeforces
Native multimodal reasoning over text and images (e.g. charts, diagrams), with 74.9 on MathVista
Long 128k-token context scaled directly during reinforcement learning
A short-CoT mode that delivers strong reasoning at lower latency, reported to beat GPT-4o and Claude 3.5 Sonnet on several benchmarks
Simple, reproducible RL recipe documented in a public technical report

Best for

Solving hard mathematics and competition-style problems
Coding and algorithmic problem solving
Multimodal reasoning over images, charts, and diagrams alongside text
Long-document analysis that benefits from the 128k context window
Step-by-step explanations where visible chain-of-thought reasoning helps

FAQ

What is Kimi K1.5?

Kimi K1.5 is a multimodal reasoning large language model from Moonshot AI, released on 2025-01-20. It is trained on text and vision data and then improved with large-scale reinforcement learning, and Moonshot reported it matched OpenAI o1 on math, coding, and multimodal reasoning.

Is Kimi K1.5 open source?

No. Unlike Moonshot AI's later Kimi K2 line, Kimi K1.5's weights were not released. It was offered through the Kimi product rather than as a downloadable open-weights model. The technical report (arXiv:2501.12599) documents the methods, but no model weights were published.

Is Kimi K1.5 multimodal?

Yes. The Kimi Team describes K1.5 as a multimodal LLM jointly trained on text and vision, so it can reason over images such as charts and diagrams as well as text. It scores 74.9 on the MathVista visual-math benchmark in long chain-of-thought mode.

How does Kimi K1.5 perform on math and coding benchmarks?

In long chain-of-thought mode the technical report lists 77.5 on AIME 2024, 96.2 on MATH-500, and roughly the 94th percentile on Codeforces. Its short chain-of-thought mode reports 60.8 on AIME 2024, 94.6 on MATH-500, and 47.3 on LiveCodeBench.
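The trade-off between the two modes is easier to see as arithmetic. The short Python sketch below takes only the scores quoted in this answer for the two benchmarks reported in both modes and computes how many points long-CoT leads by; the dictionary layout and the helper name `cot_gap` are illustrative choices, not part of any Moonshot tooling.

```python
# Scores as quoted in the FAQ answer above (long-CoT vs short-CoT).
reported = {
    "AIME 2024": {"long_cot": 77.5, "short_cot": 60.8},
    "MATH-500": {"long_cot": 96.2, "short_cot": 94.6},
}


def cot_gap(scores):
    """Points by which long-CoT exceeds short-CoT, rounded to one decimal."""
    return {
        name: round(s["long_cot"] - s["short_cot"], 1)
        for name, s in scores.items()
    }


gaps = cot_gap(reported)
for name, gap in gaps.items():
    print(f"{name}: long-CoT leads by {gap} points")
```

On these figures the gap is 16.7 points on AIME 2024 but only 1.6 points on MATH-500, which suggests the extra reasoning budget matters most on the hardest competition problems.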
