Overview
Qwen3-Coder is Alibaba's flagship open-weight coding model, released on July 22, 2025 by the Qwen team. The headline variant, Qwen3-Coder-480B-A35B-Instruct, is a Mixture-of-Experts model with 480 billion total parameters but only 35 billion active per token (160 experts, 8 routed). It is built specifically for agentic software engineering: writing, editing, and debugging code across multi-step tool-calling loops rather than single-shot completions.
The model supports a 256K-token context window natively (262,144 tokens) and stretches to 1 million tokens using YaRN extrapolation, which Alibaba positions for repository-scale understanding. It was pre-trained on 7.5 trillion tokens, about 70% of which was code, then post-trained with long-horizon (Agent RL) reinforcement learning so it can solve real tasks through many turns of tool use. Qwen3-Coder operates in non-thinking mode only and does not emit reasoning trace blocks.
Qwen3-Coder is released under the permissive Apache 2.0 license with weights freely available on Hugging Face, GitHub, and Ollama, plus hosted API access through Alibaba Cloud Model Studio and providers like OpenRouter and Together AI. Alibaba shipped an open-source CLI, Qwen Code (adapted from Gemini CLI), and the model also works with Claude Code (via a router) and Cline. A smaller 30B-A3B variant is available for local use.
| Released | 2025-07-22 |
|---|---|
| License | Apache 2.0 |
| Weights | Open weights |
| Parameters | 480B total / 35B active (MoE) |
| Context | 256K (1M with YaRN) |
| Max output | 65,536 tokens |
| Architecture | Mixture-of-Experts causal language model: 480B total parameters with 35B activated per token, 160 experts (8 routed per token), 62 layers, grouped-query attention (96 query heads / 8 key-value heads). Native 256K context (262,144 tokens), extendable to 1M tokens via YaRN. Pre-trained on 7.5T tokens with roughly 70% code data; post-trained with long-horizon agentic reinforcement learning. Runs in non-thinking mode only (no <think> blocks). |
| Modalities | Text |
| Status | Available |
Benchmarks
- SWE-bench Verified (standalone)67%
- SWE-bench Verified (OpenHands, 500 turns)69.6%
- Agentic Browser-Use (vs Claude Sonnet 4 47.4)49.9%
- Agentic Tool-Use (vs Claude Sonnet 4 65.2)68.7%
Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.
Pricing
| Input | $0.22 / 1M tokens per 1M tokens |
|---|---|
| Output | $1.80 / 1M tokens per 1M tokens |
Pricing shown for the Alibaba-hosted endpoint via OpenRouter; rates increase for requests above 128K input tokens. Open weights are free to self-host.
Strengths
- Open weights under Apache 2.0 — fully commercial-friendly, self-hostable, and downloadable from Hugging Face, GitHub, and Ollama
- State-of-the-art agentic coding among open models, with SWE-bench Verified results approaching Claude Sonnet 4
- Very large context: 256K tokens natively, up to 1M with YaRN, suited to repository-scale code understanding
- Efficient MoE design — only 35B of 480B parameters activate per token, lowering inference cost relative to dense models of similar capability
- Purpose-built for tool use and multi-turn agent loops via long-horizon reinforcement learning
- First-party open-source CLI (Qwen Code) plus compatibility with Claude Code and Cline
Best for
- Autonomous agentic coding: multi-step bug fixing, feature implementation, and refactoring inside an agent loop
- Repository-scale code analysis and editing that exploits the 256K-to-1M context window
- Self-hosted or private-cloud code assistants where open weights and Apache 2.0 licensing matter
- Powering terminal coding agents via the Qwen Code CLI, Claude Code, or Cline
- Browser-use and tool-use automation tasks that require sustained multi-turn reasoning
- Code generation and completion across many programming languages for cost-sensitive, high-volume workloads
How to access
| Provider | Model ID |
|---|---|
| Alibaba Cloud Model Studio ↗ | qwen3-coder-plus |
| OpenRouter ↗ | qwen/qwen3-coder |
| Together AI ↗ | Qwen/Qwen3-Coder-480B-A35B-Instruct |
| Ollama ↗ | qwen3-coder |
Qwen-Coder — every version
The full lineage of the Qwen-Coder line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.
| Version | Released | Context | License |
|---|---|---|---|
| Qwen3-Codercurrent | 2025-07-22 | — | Apache-2.0 |
| Qwen2.5-Coder | 2024-11 | — | Open weights |
FAQ
Is Qwen3-Coder open source?
Yes. Qwen3-Coder is released under the permissive Apache 2.0 license, and the weights are freely downloadable from Hugging Face, GitHub, and Ollama. You can self-host it or use a hosted API.
How big is Qwen3-Coder and how much context does it handle?
The flagship Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts model with 480 billion total parameters but only 35 billion active per token. It supports a 256K-token context natively (262,144 tokens) and up to 1 million tokens using YaRN extrapolation.
How does Qwen3-Coder compare to Claude Sonnet 4?
Alibaba positions Qwen3-Coder as state of the art among open models for agentic coding, browser-use, and tool-use, comparable to Claude Sonnet 4. On SWE-bench Verified it scores 67.0% standalone and 69.6% with the OpenHands scaffold at 500 turns, close to Claude Sonnet 4's reported figures.
How can I use Qwen3-Coder?
You can run the open weights locally (a smaller 30B-A3B variant exists for lighter hardware), call it through Alibaba Cloud Model Studio, OpenRouter, Together AI, or Ollama, and drive it with the open-source Qwen Code CLI, Claude Code (via a router), or Cline.