AI/TLDR

Qwen3-Max

Alibaba's trillion-parameter flagship for coding and agentic tasks

Overview

Qwen3-Max is the largest model in Alibaba's Qwen family, crossing the 1-trillion-parameter mark with a Mixture-of-Experts design that activates only a fraction of those parameters per token. It launched in September 2025 — first as a preview on September 5, then as a full production release later that month — and is positioned as the highest-performing tier of the Qwen-Max line for complex, multi-step tasks. Unlike most of Alibaba's Qwen releases, Qwen3-Max is closed-weight: it runs only through Alibaba Cloud's Model Studio API, Qwen Chat, and third-party gateways such as OpenRouter, with no local deployment.

The model was pretrained on roughly 36 trillion tokens (about twice the data used for Qwen2.5) with a multilingual, coding, and STEM/reasoning emphasis. It ships in two flavors: Qwen3-Max-Instruct, a fast non-thinking model aimed at coding and agentic tool use, and Qwen3-Max-Thinking, a reasoning variant that can call a code interpreter and use parallel test-time compute. Through the API it exposes a 262,144-token context window and up to 32,768 output tokens.

Qwen3-Max is a text-only model — vision tasks in the Qwen3 generation are handled by separate Qwen3-VL models. On release, the preview version of Qwen3-Max reached the global top three on the LMArena text leaderboard, and Alibaba reported strong day-one results on coding and agentic benchmarks. Treat the most aggressive math claims as vendor-reported rather than independently verified.

Released2025-09
LicenseProprietary (closed weights), API access via Alibaba Cloud Model Studio
WeightsAPI only
ParametersOver 1 trillion (MoE, sparse activation)
Context262K
Max output32K
ArchitectureSparse Mixture-of-Experts (MoE) transformer that activates a subset of its 1T+ parameters per token. Pretrained on roughly 36 trillion tokens (about 2x Qwen2.5). Shipped in two variants: Qwen3-Max-Instruct (no extended reasoning) and Qwen3-Max-Thinking (tool-augmented reasoning mode).
Knowledge cutoffNot officially disclosed
ModalitiesText
StatusAvailable

Benchmarks

  1. SWE-bench Verified (Qwen3-Max-Instruct)69.6%
  2. Tau2-bench (Qwen3-Max-Instruct, agentic tool-calling)74.8%

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input$1.20 / 1M tokens (0-32K input tier; rises to $2.40 at 32K-128K and $3.00 at 128K-252K) per 1M tokens
Output$6.00 / 1M tokens (0-32K tier; up to $15.00 at higher tiers) per 1M tokens

Tiered by input length on Alibaba Cloud Model Studio (International deployment). Batch invocation is charged at 50% of real-time pricing. Global-deployment pricing is lower.

Pricing source ↗

Strengths

  • Trillion-parameter MoE scale with efficient sparse activation per token
  • Strong software-engineering performance (69.6 on SWE-bench Verified, Instruct variant)
  • Top-tier agentic tool-calling (74.8 on Tau2-bench, Instruct variant)
  • Large 262K-token context window for long documents and codebases
  • Two variants — fast Instruct and reasoning-focused Thinking — under one model line
  • Native function calling, structured JSON output, and web-search support via Model Studio

Best for

  • Agentic workflows and multi-tool orchestration
  • Software engineering and code generation across large repositories
  • Long-context document analysis and summarization
  • Complex multi-step reasoning and STEM problem solving
  • Multilingual text generation and assistance
  • Building API-driven assistants with structured output and function calling

How to access

ProviderModel ID
Alibaba Cloud Model Studio ↗qwen3-max
OpenRouter ↗qwen/qwen3-max

Qwen-Max — every version

The full lineage of the Qwen-Max line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

VersionReleasedContextLicense
Qwen3.7-Maxcurrent2026-05Proprietary
Qwen3-Max2025-09Proprietary
Qwen2.5-Max2025-01-29Proprietary

FAQ

Is Qwen3-Max open source?

No. Unlike many other Qwen models, Qwen3-Max is closed-weight. It is available only through APIs — Alibaba Cloud Model Studio, Qwen Chat, and gateways like OpenRouter — and does not support local deployment.

How many parameters does Qwen3-Max have?

Alibaba states Qwen3-Max crosses the 1-trillion-parameter mark using a Mixture-of-Experts (MoE) design, so only a sparse subset of those parameters is activated per token. A full technical report with the exact figure was not published at launch.

What is the context window of Qwen3-Max?

Through Alibaba Cloud Model Studio, Qwen3-Max supports a 262,144-token context window (about 262K) and up to 32,768 output tokens.

Does Qwen3-Max support images or video?

No. Qwen3-Max is a text-only model. Vision and video tasks in the Qwen3 generation are handled by separate Qwen3-VL models, not by Qwen3-Max.