AI/TLDR

Qwen3-Coder

Alibaba's open-weight 480B agentic coding model with 256K-to-1M context

Overview

Qwen3-Coder is Alibaba's flagship open-weight coding model, released on July 22, 2025 by the Qwen team. The headline variant, Qwen3-Coder-480B-A35B-Instruct, is a Mixture-of-Experts model with 480 billion total parameters but only 35 billion active per token (160 experts, 8 routed). It is built specifically for agentic software engineering: writing, editing, and debugging code across multi-step tool-calling loops rather than single-shot completions.

The model supports a 256K-token context window natively (262,144 tokens) and stretches to 1 million tokens using YaRN extrapolation, which Alibaba positions for repository-scale understanding. It was pre-trained on 7.5 trillion tokens, about 70% of which was code, then post-trained with long-horizon (Agent RL) reinforcement learning so it can solve real tasks through many turns of tool use. Qwen3-Coder operates in non-thinking mode only and does not emit reasoning trace blocks.

Qwen3-Coder is released under the permissive Apache 2.0 license with weights freely available on Hugging Face, GitHub, and Ollama, plus hosted API access through Alibaba Cloud Model Studio and providers like OpenRouter and Together AI. Alibaba shipped an open-source CLI, Qwen Code (adapted from Gemini CLI), and the model also works with Claude Code (via a router) and Cline. A smaller 30B-A3B variant is available for local use.

Released2025-07-22
LicenseApache 2.0
WeightsOpen weights
Parameters480B total / 35B active (MoE)
Context256K (1M with YaRN)
Max output65,536 tokens
ArchitectureMixture-of-Experts causal language model: 480B total parameters with 35B activated per token, 160 experts (8 routed per token), 62 layers, grouped-query attention (96 query heads / 8 key-value heads). Native 256K context (262,144 tokens), extendable to 1M tokens via YaRN. Pre-trained on 7.5T tokens with roughly 70% code data; post-trained with long-horizon agentic reinforcement learning. Runs in non-thinking mode only (no <think> blocks).
ModalitiesText
StatusAvailable

Benchmarks

  1. SWE-bench Verified (standalone)67%
  2. SWE-bench Verified (OpenHands, 500 turns)69.6%
  3. Agentic Browser-Use (vs Claude Sonnet 4 47.4)49.9%
  4. Agentic Tool-Use (vs Claude Sonnet 4 65.2)68.7%

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input$0.22 / 1M tokens per 1M tokens
Output$1.80 / 1M tokens per 1M tokens

Pricing shown for the Alibaba-hosted endpoint via OpenRouter; rates increase for requests above 128K input tokens. Open weights are free to self-host.

Pricing source ↗

Strengths

  • Open weights under Apache 2.0 — fully commercial-friendly, self-hostable, and downloadable from Hugging Face, GitHub, and Ollama
  • State-of-the-art agentic coding among open models, with SWE-bench Verified results approaching Claude Sonnet 4
  • Very large context: 256K tokens natively, up to 1M with YaRN, suited to repository-scale code understanding
  • Efficient MoE design — only 35B of 480B parameters activate per token, lowering inference cost relative to dense models of similar capability
  • Purpose-built for tool use and multi-turn agent loops via long-horizon reinforcement learning
  • First-party open-source CLI (Qwen Code) plus compatibility with Claude Code and Cline

Best for

  • Autonomous agentic coding: multi-step bug fixing, feature implementation, and refactoring inside an agent loop
  • Repository-scale code analysis and editing that exploits the 256K-to-1M context window
  • Self-hosted or private-cloud code assistants where open weights and Apache 2.0 licensing matter
  • Powering terminal coding agents via the Qwen Code CLI, Claude Code, or Cline
  • Browser-use and tool-use automation tasks that require sustained multi-turn reasoning
  • Code generation and completion across many programming languages for cost-sensitive, high-volume workloads

How to access

ProviderModel ID
Alibaba Cloud Model Studio ↗qwen3-coder-plus
OpenRouter ↗qwen/qwen3-coder
Together AI ↗Qwen/Qwen3-Coder-480B-A35B-Instruct
Ollama ↗qwen3-coder

Qwen-Coder — every version

The full lineage of the Qwen-Coder line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

VersionReleasedContextLicense
Qwen3-Codercurrent2025-07-22Apache-2.0
Qwen2.5-Coder2024-11Open weights

FAQ

Is Qwen3-Coder open source?

Yes. Qwen3-Coder is released under the permissive Apache 2.0 license, and the weights are freely downloadable from Hugging Face, GitHub, and Ollama. You can self-host it or use a hosted API.

How big is Qwen3-Coder and how much context does it handle?

The flagship Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts model with 480 billion total parameters but only 35 billion active per token. It supports a 256K-token context natively (262,144 tokens) and up to 1 million tokens using YaRN extrapolation.

How does Qwen3-Coder compare to Claude Sonnet 4?

Alibaba positions Qwen3-Coder as state of the art among open models for agentic coding, browser-use, and tool-use, comparable to Claude Sonnet 4. On SWE-bench Verified it scores 67.0% standalone and 69.6% with the OpenHands scaffold at 500 turns, close to Claude Sonnet 4's reported figures.

How can I use Qwen3-Coder?

You can run the open weights locally (a smaller 30B-A3B variant exists for lighter hardware), call it through Alibaba Cloud Model Studio, OpenRouter, Together AI, or Ollama, and drive it with the open-source Qwen Code CLI, Claude Code (via a router), or Cline.