Qwen3-Max

Alibaba's trillion-parameter flagship for coding and agentic tasks

Overview

Qwen3-Max is the largest model in Alibaba's Qwen family, crossing the 1-trillion-parameter mark with a Mixture-of-Experts design that activates only a fraction of those parameters per token. It launched in September 2025 — first as a preview on September 5, then as a full production release later that month — and is positioned as the highest-performing tier of the Qwen-Max line for complex, multi-step tasks. Unlike most of Alibaba's Qwen releases, Qwen3-Max is closed-weight: it runs only through Alibaba Cloud's Model Studio API, Qwen Chat, and third-party gateways such as OpenRouter, with no local deployment.

The model was pretrained on roughly 36 trillion tokens (about twice the data used for Qwen2.5) with a multilingual, coding, and STEM/reasoning emphasis. It ships in two flavors: Qwen3-Max-Instruct, a fast non-thinking model aimed at coding and agentic tool use, and Qwen3-Max-Thinking, a reasoning variant that can call a code interpreter and use parallel test-time compute. Through the API it exposes a 262,144-token context window and up to 32,768 output tokens.

Qwen3-Max is a text-only model — vision tasks in the Qwen3 generation are handled by separate Qwen3-VL models. On release, the preview version of Qwen3-Max reached the global top three on the LMArena text leaderboard, and Alibaba reported strong day-one results on coding and agentic benchmarks. Treat the most aggressive math claims as vendor-reported rather than independently verified.

Released	2025-09
License	Proprietary (closed weights), API access via Alibaba Cloud Model Studio
Weights	API only
Parameters	Over 1 trillion (MoE, sparse activation)
Context	262K
Max output	32K
Architecture	Sparse Mixture-of-Experts (MoE) transformer that activates a subset of its 1T+ parameters per token. Pretrained on roughly 36 trillion tokens (about 2x Qwen2.5). Shipped in two variants: Qwen3-Max-Instruct (no extended reasoning) and Qwen3-Max-Thinking (tool-augmented reasoning mode).
Knowledge cutoff	Not officially disclosed
Modalities	Text
Status	Available

Benchmarks

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input	$1.20 / 1M tokens (0-32K input tier; rises to $2.40 at 32K-128K and $3.00 at 128K-252K) per 1M tokens
Output	$6.00 / 1M tokens (0-32K tier; up to $15.00 at higher tiers) per 1M tokens

Tiered by input length on Alibaba Cloud Model Studio (International deployment). Batch invocation is charged at 50% of real-time pricing. Global-deployment pricing is lower.

Pricing source ↗

Strengths

Trillion-parameter MoE scale with efficient sparse activation per token
Strong software-engineering performance (69.6 on SWE-bench Verified, Instruct variant)
Top-tier agentic tool-calling (74.8 on Tau2-bench, Instruct variant)
Large 262K-token context window for long documents and codebases
Two variants — fast Instruct and reasoning-focused Thinking — under one model line
Native function calling, structured JSON output, and web-search support via Model Studio

Best for

Agentic workflows and multi-tool orchestration
Software engineering and code generation across large repositories
Long-context document analysis and summarization
Complex multi-step reasoning and STEM problem solving
Multilingual text generation and assistance
Building API-driven assistants with structured output and function calling

How to access

Provider	Model ID
Alibaba Cloud Model Studio ↗	`qwen3-max`
OpenRouter ↗	`qwen/qwen3-max`

Qwen-Max — every version

The full lineage of the Qwen-Max line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

Version	Released	Context	License
Qwen3.7-Maxcurrent	2026-05	—	Proprietary
Qwen3-Max	2025-09	—	Proprietary
Qwen2.5-Max	2025-01-29	—	Proprietary

FAQ

Is Qwen3-Max open source?

No. Unlike many other Qwen models, Qwen3-Max is closed-weight. It is available only through APIs — Alibaba Cloud Model Studio, Qwen Chat, and gateways like OpenRouter — and does not support local deployment.

How many parameters does Qwen3-Max have?

Alibaba states Qwen3-Max crosses the 1-trillion-parameter mark using a Mixture-of-Experts (MoE) design, so only a sparse subset of those parameters is activated per token. A full technical report with the exact figure was not published at launch.

What is the context window of Qwen3-Max?

Through Alibaba Cloud Model Studio, Qwen3-Max supports a 262,144-token context window (about 262K) and up to 32,768 output tokens.

Does Qwen3-Max support images or video?

No. Qwen3-Max is a text-only model. Vision and video tasks in the Qwen3 generation are handled by separate Qwen3-VL models, not by Qwen3-Max.

// Overview

// Benchmarks

// Pricing

// Strengths

// Best for

// How to access

// Qwen-Max — every version

// FAQ