Qwen3 (2507 update)

Refreshed open-weight Qwen3 checkpoints that split into dedicated Instruct and Thinking variants, push context to ~1M tokens, and ship under Apache 2.0.

Overview

Qwen3 (2507 update) is the July 2025 refresh of Alibaba's open-weight Qwen3 line. Rather than keep Qwen3's single hybrid 'thinking/non-thinking' switch, the 2507 release splits each model into two dedicated checkpoints: an Instruct variant tuned for fast, direct answers and a Thinking variant that always reasons step-by-step. The headline models are the large Mixture-of-Experts Qwen3-235B-A22B-Instruct-2507 and Qwen3-235B-A22B-Thinking-2507, alongside the smaller, deployable Qwen3-30B-A3B-Instruct-2507 and Qwen3-30B-A3B-Thinking-2507.

The 235B flagship activates 22B of its 235B parameters per token, using 128 experts with 8 active across 94 layers. All four 2507 checkpoints are released under the permissive Apache 2.0 license with downloadable weights on Hugging Face and ModelScope, and run on vLLM, SGLang, Ollama, and LM Studio. Native context is 262,144 tokens, extendable to roughly 1,010,000 (~1M) tokens with the long-context configuration.

Compared with the original April 2025 Qwen3, the 2507 update raises scores on knowledge, math, coding, and agentic benchmarks and improves long-tail multilingual coverage. The Thinking variant is a strong open-weight reasoning model: it reports 92.3 on AIME25 and 74.1 on LiveCodeBench v6, while the Instruct variant favors speed at 70.3 on AIME25 and 83.0 on MMLU-Pro.

Released	2025-07
License	Apache-2.0
Weights	Open weights
Parameters	235B total · 22B active (also 30B-A3B)
Context	262K (extendable to 1M)
Max output	81,920 tokens (Thinking)
Architecture	Mixture-of-Experts (128 experts, 8 active)
Modalities	Text
Status	Generally available

Benchmarks

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input	$0.09 / 1M tokens
Output	$0.10 / 1M tokens

OpenRouter list price for Qwen3-235B-A22B-Instruct-2507; the Thinking variant lists at $0.10 in / $0.10 out. Weights are free to self-host under Apache 2.0.

Pricing source ↗

Strengths

Permissive Apache 2.0 license with fully downloadable weights — free for commercial use and self-hosting
Separate Instruct and Thinking checkpoints, so you pick speed or deep reasoning instead of toggling a hybrid switch
Mixture-of-Experts design activates only 22B of 235B parameters per token, keeping inference cost low for the flagship
Native 262K context extendable to ~1M tokens for long-document and large-codebase work
Strong open-weight reasoning: 92.3 AIME25 and 74.1 LiveCodeBench v6 on the 235B Thinking variant
Smaller 30B-A3B variants are deployable on a single GPU or high-memory workstation via Ollama and LM Studio

Best for

Self-hosted assistants and agents where an open Apache 2.0 license and on-prem control matter
Math, science, and competition-style reasoning with the Thinking variant
Coding and code-generation workflows benchmarked on LiveCodeBench
Long-context tasks: large-document analysis, retrieval-augmented pipelines, and whole-repo reasoning up to ~1M tokens
Tool use and agentic workflows that need reliable instruction following
Cost-sensitive deployments using the 30B-A3B variants on modest hardware

How to access

Provider	Model ID
OpenRouter ↗	`qwen/qwen3-235b-a22b-2507`
OpenRouter (Thinking) ↗	`qwen/qwen3-235b-a22b-thinking-2507`
Together AI ↗	—
Hugging Face (weights) ↗	—

Qwen (open-weight) — every version

The full lineage of the Qwen (open-weight) line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

Version	Released	Context	License
Qwen3.6current	2026-04	—	Apache-2.0
Qwen3.5	2026-02-16	—	Apache-2.0
Qwen3 (2507 update)	2025-07	—	Apache-2.0
Qwen3	2025-04-28	—	Apache-2.0
Qwen2.5	2024-09	—	Apache-2.0
Qwen2	2024-06	—	Apache-2.0

FAQ

What is the Qwen3 2507 update?

It is Alibaba's July 2025 refresh of the open-weight Qwen3 line. Instead of one hybrid model, it ships separate Instruct (fast, direct) and Thinking (always reasoning) checkpoints, headlined by Qwen3-235B-A22B-Instruct-2507 and Qwen3-235B-A22B-Thinking-2507, plus smaller Qwen3-30B-A3B-2507 variants. All are Apache 2.0 with downloadable weights.

What is the difference between the Instruct and Thinking variants?

The Instruct variants answer directly and do not emit reasoning blocks, favoring speed and instruction following. The Thinking variants always reason step-by-step before answering and score much higher on hard math and coding benchmarks — for example 92.3 vs 70.3 on AIME25 and 74.1 vs 51.8 on LiveCodeBench v6 for the 235B models.

How long is the context window?

The 2507 checkpoints support 262,144 tokens natively, extendable to roughly 1,010,000 (~1M) tokens using the long-context configuration documented on the model cards.

Is Qwen3 2507 free and open-weight?

Yes. All four 2507 checkpoints are released under the permissive Apache 2.0 license with weights available on Hugging Face and ModelScope, so they can be self-hosted and used commercially for free. Hosted API access (for example via OpenRouter) lists around $0.09–$0.10 per 1M input/output tokens for the 235B models.

// Overview

// Benchmarks

// Pricing

// Strengths

// Best for

// How to access

// Qwen (open-weight) — every version

// FAQ