Qwen3-Coder

Alibaba's open-weight 480B agentic coding model with 256K-to-1M context

Overview

Qwen3-Coder is Alibaba's flagship open-weight coding model, released on July 22, 2025 by the Qwen team. The headline variant, Qwen3-Coder-480B-A35B-Instruct, is a Mixture-of-Experts model with 480 billion total parameters but only 35 billion active per token (160 experts, 8 routed). It is built specifically for agentic software engineering: writing, editing, and debugging code across multi-step tool-calling loops rather than single-shot completions.

The model supports a 256K-token context window natively (262,144 tokens) and stretches to 1 million tokens using YaRN extrapolation, which Alibaba positions for repository-scale understanding. It was pre-trained on 7.5 trillion tokens, about 70% of which was code, then post-trained with long-horizon (Agent RL) reinforcement learning so it can solve real tasks through many turns of tool use. Qwen3-Coder operates in non-thinking mode only and does not emit reasoning trace blocks.

Qwen3-Coder is released under the permissive Apache 2.0 license with weights freely available on Hugging Face, GitHub, and Ollama, plus hosted API access through Alibaba Cloud Model Studio and providers like OpenRouter and Together AI. Alibaba shipped an open-source CLI, Qwen Code (adapted from Gemini CLI), and the model also works with Claude Code (via a router) and Cline. A smaller 30B-A3B variant is available for local use.

Released	2025-07-22
License	Apache 2.0
Weights	Open weights
Parameters	480B total / 35B active (MoE)
Context	256K (1M with YaRN)
Max output	65,536 tokens
Architecture	Mixture-of-Experts causal language model: 480B total parameters with 35B activated per token, 160 experts (8 routed per token), 62 layers, grouped-query attention (96 query heads / 8 key-value heads). Native 256K context (262,144 tokens), extendable to 1M tokens via YaRN. Pre-trained on 7.5T tokens with roughly 70% code data; post-trained with long-horizon agentic reinforcement learning. Runs in non-thinking mode only (no <think> blocks).
Modalities	Text
Status	Available

Benchmarks

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input	$0.22 / 1M tokens per 1M tokens
Output	$1.80 / 1M tokens per 1M tokens

Pricing shown for the Alibaba-hosted endpoint via OpenRouter; rates increase for requests above 128K input tokens. Open weights are free to self-host.

Pricing source ↗

Strengths

Open weights under Apache 2.0 — fully commercial-friendly, self-hostable, and downloadable from Hugging Face, GitHub, and Ollama
State-of-the-art agentic coding among open models, with SWE-bench Verified results approaching Claude Sonnet 4
Very large context: 256K tokens natively, up to 1M with YaRN, suited to repository-scale code understanding
Efficient MoE design — only 35B of 480B parameters activate per token, lowering inference cost relative to dense models of similar capability
Purpose-built for tool use and multi-turn agent loops via long-horizon reinforcement learning
First-party open-source CLI (Qwen Code) plus compatibility with Claude Code and Cline

Best for

Autonomous agentic coding: multi-step bug fixing, feature implementation, and refactoring inside an agent loop
Repository-scale code analysis and editing that exploits the 256K-to-1M context window
Self-hosted or private-cloud code assistants where open weights and Apache 2.0 licensing matter
Powering terminal coding agents via the Qwen Code CLI, Claude Code, or Cline
Browser-use and tool-use automation tasks that require sustained multi-turn reasoning
Code generation and completion across many programming languages for cost-sensitive, high-volume workloads

How to access

Provider	Model ID
Alibaba Cloud Model Studio ↗	`qwen3-coder-plus`
OpenRouter ↗	`qwen/qwen3-coder`
Together AI ↗	`Qwen/Qwen3-Coder-480B-A35B-Instruct`
Ollama ↗	`qwen3-coder`

Qwen-Coder — every version

The full lineage of the Qwen-Coder line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

Version	Released	Context	License
Qwen3-Codercurrent	2025-07-22	—	Apache-2.0
Qwen2.5-Coder	2024-11	—	Open weights

FAQ

Is Qwen3-Coder open source?

Yes. Qwen3-Coder is released under the permissive Apache 2.0 license, and the weights are freely downloadable from Hugging Face, GitHub, and Ollama. You can self-host it or use a hosted API.

How big is Qwen3-Coder and how much context does it handle?

The flagship Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts model with 480 billion total parameters but only 35 billion active per token. It supports a 256K-token context natively (262,144 tokens) and up to 1 million tokens using YaRN extrapolation.

How does Qwen3-Coder compare to Claude Sonnet 4?

Alibaba positions Qwen3-Coder as state of the art among open models for agentic coding, browser-use, and tool-use, comparable to Claude Sonnet 4. On SWE-bench Verified it scores 67.0% standalone and 69.6% with the OpenHands scaffold at 500 turns, close to Claude Sonnet 4's reported figures.

How can I use Qwen3-Coder?

You can run the open weights locally (a smaller 30B-A3B variant exists for lighter hardware), call it through Alibaba Cloud Model Studio, OpenRouter, Together AI, or Ollama, and drive it with the open-source Qwen Code CLI, Claude Code (via a router), or Cline.

// Overview

// Benchmarks

// Pricing

// Strengths

// Best for

// How to access

// Qwen-Coder — every version

// FAQ