Qwen2.5-Max

Alibaba's large-scale MoE flagship that took on DeepSeek V3 and GPT-4o

Overview

Qwen2.5-Max is the flagship model of Alibaba's Qwen2.5 generation, unveiled on January 29, 2025. It is a large-scale Mixture-of-Experts (MoE) model pretrained on more than 20 trillion tokens and post-trained with supervised fine-tuning and RLHF. Unlike the open-weight Qwen2.5 dense models (0.5B to 72B, released under Apache 2.0), Qwen2.5-Max was kept proprietary and offered only through Alibaba Cloud's API and the Qwen Chat web app.

The launch landed in the middle of the DeepSeek V3 moment, and Alibaba positioned Qwen2.5-Max directly against it. On the chat-instruct benchmarks Alibaba published, Qwen2.5-Max edged out DeepSeek V3 on Arena-Hard, LiveBench, LiveCodeBench, and GPQA-Diamond, and traded blows with GPT-4o and Claude 3.5 Sonnet on knowledge tasks like MMLU-Pro. It was, for a window in early 2025, one of the strongest non-reasoning Chinese models available.

Today Qwen2.5-Max is a historical checkpoint rather than Alibaba's headline model. Its API snapshot, qwen-max-2025-01-25, remains reachable through Alibaba Cloud Model Studio and resellers such as OpenRouter, but Alibaba's flagship line has since moved on through the Qwen3 and Qwen3-Max generations. Treat it as a capable, mid-2020s general-purpose chat model that has been superseded.

Released	2025-01-29
License	Proprietary (closed weights). Unlike the smaller Apache-2.0 open-weight Qwen2.5 models, the Max tier is API-only.
Weights	API only
Parameters	Not disclosed (Mixture-of-Experts; total parameter count never published by Alibaba)
Context	~32K tokens
Max output	Not officially disclosed
Architecture	Large-scale Mixture-of-Experts (MoE) transformer, pretrained on over 20 trillion tokens, then aligned with Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF).
Knowledge cutoff	Not officially disclosed
Modalities	text input, text output
Status	Available via API as the qwen-max-2025-01-25 snapshot, but superseded as Alibaba's flagship by the later Qwen3-Max line.

Benchmarks

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input	$1.60 / 1M tokens per 1M tokens (USD, international qwen-max)
Output	$6.40 / 1M tokens per 1M tokens (USD, international qwen-max)

Pricing for the international qwen-max deployment on Alibaba Cloud Model Studio. Mainland China deployment of the qwen-max-2025-01-25 snapshot is billed in RMB at lower equivalent rates.

Pricing source ↗

Strengths

Strong instruction-following and chat quality for its era — led DeepSeek V3 on Arena-Hard (89.4 vs 85.5)
Competitive general knowledge and reasoning, scoring 76.1 on MMLU-Pro, close to GPT-4o and Claude 3.5 Sonnet
Solid multilingual coverage, including strong Chinese-language performance, from 20T+ token pretraining
Mixture-of-Experts architecture delivers flagship-tier quality with efficient per-token inference
OpenAI-compatible API on Alibaba Cloud makes it a low-friction drop-in for existing tooling

Best for

General-purpose chat assistants and Q&A in English and Chinese
Multilingual content generation, summarization, and rewriting
Code generation and completion for everyday programming tasks
Knowledge-heavy reasoning and analysis at non-reasoning-model latency
A China-region alternative to GPT-4o-class proprietary APIs

How to access

Provider	Model ID
Alibaba Cloud Model Studio ↗	`qwen-max-2025-01-25`
OpenRouter ↗	`qwen/qwen-max-2025-01-25`

Qwen-Max — every version

The full lineage of the Qwen-Max line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

Version	Released	Context	License
Qwen3.7-Maxcurrent	2026-05	—	Proprietary
Qwen3-Max	2025-09	—	Proprietary
Qwen2.5-Max	2025-01-29	—	Proprietary

FAQ

Is Qwen2.5-Max open source?

No. Qwen2.5-Max is proprietary and closed-weight, available only through Alibaba Cloud's API and the Qwen Chat app. This differs from the smaller Qwen2.5 dense models (0.5B-72B), which Alibaba released as open weights under the Apache 2.0 license.

How does Qwen2.5-Max compare to DeepSeek V3?

On the chat-instruct benchmarks Alibaba published, Qwen2.5-Max edged out DeepSeek V3 on Arena-Hard (89.4 vs 85.5), LiveBench (62.2 vs 60.5), LiveCodeBench (38.7 vs 37.6), GPQA-Diamond (60.1 vs 59.1), and MMLU-Pro (76.1 vs 75.9). The two were close competitors in early 2025.

How much does the Qwen2.5-Max API cost?

On the international Alibaba Cloud Model Studio deployment, qwen-max is priced at $1.60 per 1M input tokens and $6.40 per 1M output tokens. The mainland China deployment of the qwen-max-2025-01-25 snapshot is billed in RMB at lower equivalent rates.

Is Qwen2.5-Max still Alibaba's best model?

No. The qwen-max-2025-01-25 snapshot is still reachable via API, but Alibaba's flagship line has since advanced through the Qwen3 and Qwen3-Max generations, which outperform it. Qwen2.5-Max is now best treated as a superseded checkpoint.

// Overview

// Benchmarks

// Pricing

// Strengths

// Best for

// How to access

// Qwen-Max — every version

// FAQ