AI/TLDR

Qwen3 (2507 update)

Refreshed open-weight Qwen3 checkpoints that split into dedicated Instruct and Thinking variants, push context to ~1M tokens, and ship under Apache 2.0.

Overview

Qwen3 (2507 update) is the July 2025 refresh of Alibaba's open-weight Qwen3 line. Rather than keep Qwen3's single hybrid 'thinking/non-thinking' switch, the 2507 release splits each model into two dedicated checkpoints: an Instruct variant tuned for fast, direct answers and a Thinking variant that always reasons step-by-step. The headline models are the large Mixture-of-Experts Qwen3-235B-A22B-Instruct-2507 and Qwen3-235B-A22B-Thinking-2507, alongside the smaller, deployable Qwen3-30B-A3B-Instruct-2507 and Qwen3-30B-A3B-Thinking-2507.

The 235B flagship activates 22B of its 235B parameters per token, using 128 experts with 8 active across 94 layers. All four 2507 checkpoints are released under the permissive Apache 2.0 license with downloadable weights on Hugging Face and ModelScope, and run on vLLM, SGLang, Ollama, and LM Studio. Native context is 262,144 tokens, extendable to roughly 1,010,000 (~1M) tokens with the long-context configuration.

Compared with the original April 2025 Qwen3, the 2507 update raises scores on knowledge, math, coding, and agentic benchmarks and improves long-tail multilingual coverage. The Thinking variant is a strong open-weight reasoning model: it reports 92.3 on AIME25 and 74.1 on LiveCodeBench v6, while the Instruct variant favors speed at 70.3 on AIME25 and 83.0 on MMLU-Pro.

Released2025-07
LicenseApache-2.0
WeightsOpen weights
Parameters235B total · 22B active (also 30B-A3B)
Context262K (extendable to 1M)
Max output81,920 tokens (Thinking)
ArchitectureMixture-of-Experts (128 experts, 8 active)
ModalitiesText
StatusGenerally available

Benchmarks

  1. MMLU-Pro (Thinking)84.4%
  2. MMLU-Pro (Instruct)83%
  3. MMLU-Redux (Instruct)93.1%
  4. GPQA (Thinking)81.1%
  5. GPQA (Instruct)77.5%
  6. AIME25 (Thinking)92.3%
  7. AIME25 (Instruct)70.3%
  8. LiveCodeBench v6 (Thinking)74.1%
  9. LiveCodeBench v6 (Instruct)51.8%
  10. Arena-Hard v2 (Instruct)79.2%
  11. SuperGPQA (Thinking)64.9%

Scores on a 0–100 scale (25-point gridlines); higher is better. Each benchmark links to its published source.

Pricing

Input$0.09 / 1M tokens
Output$0.10 / 1M tokens

OpenRouter list price for Qwen3-235B-A22B-Instruct-2507; the Thinking variant lists at $0.10 in / $0.10 out. Weights are free to self-host under Apache 2.0.

Pricing source ↗

Strengths

  • Permissive Apache 2.0 license with fully downloadable weights — free for commercial use and self-hosting
  • Separate Instruct and Thinking checkpoints, so you pick speed or deep reasoning instead of toggling a hybrid switch
  • Mixture-of-Experts design activates only 22B of 235B parameters per token, keeping inference cost low for the flagship
  • Native 262K context extendable to ~1M tokens for long-document and large-codebase work
  • Strong open-weight reasoning: 92.3 AIME25 and 74.1 LiveCodeBench v6 on the 235B Thinking variant
  • Smaller 30B-A3B variants are deployable on a single GPU or high-memory workstation via Ollama and LM Studio

Best for

  • Self-hosted assistants and agents where an open Apache 2.0 license and on-prem control matter
  • Math, science, and competition-style reasoning with the Thinking variant
  • Coding and code-generation workflows benchmarked on LiveCodeBench
  • Long-context tasks: large-document analysis, retrieval-augmented pipelines, and whole-repo reasoning up to ~1M tokens
  • Tool use and agentic workflows that need reliable instruction following
  • Cost-sensitive deployments using the 30B-A3B variants on modest hardware

How to access

ProviderModel ID
OpenRouter ↗qwen/qwen3-235b-a22b-2507
OpenRouter (Thinking) ↗qwen/qwen3-235b-a22b-thinking-2507
Together AI ↗
Hugging Face (weights) ↗

Qwen (open-weight) — every version

The full lineage of the Qwen (open-weight) line, newest first. Every version has its own page — click any to compare specs, benchmarks and pricing.

VersionReleasedContextLicense
Qwen3.6current2026-04Apache-2.0
Qwen3.52026-02-16Apache-2.0
Qwen3 (2507 update)2025-07Apache-2.0
Qwen32025-04-28Apache-2.0
Qwen2.52024-09Apache-2.0
Qwen22024-06Apache-2.0

FAQ

What is the Qwen3 2507 update?

It is Alibaba's July 2025 refresh of the open-weight Qwen3 line. Instead of one hybrid model, it ships separate Instruct (fast, direct) and Thinking (always reasoning) checkpoints, headlined by Qwen3-235B-A22B-Instruct-2507 and Qwen3-235B-A22B-Thinking-2507, plus smaller Qwen3-30B-A3B-2507 variants. All are Apache 2.0 with downloadable weights.

What is the difference between the Instruct and Thinking variants?

The Instruct variants answer directly and do not emit reasoning blocks, favoring speed and instruction following. The Thinking variants always reason step-by-step before answering and score much higher on hard math and coding benchmarks — for example 92.3 vs 70.3 on AIME25 and 74.1 vs 51.8 on LiveCodeBench v6 for the 235B models.

How long is the context window?

The 2507 checkpoints support 262,144 tokens natively, extendable to roughly 1,010,000 (~1M) tokens using the long-context configuration documented on the model cards.

Is Qwen3 2507 free and open-weight?

Yes. All four 2507 checkpoints are released under the permissive Apache 2.0 license with weights available on Hugging Face and ModelScope, so they can be self-hosted and used commercially for free. Hosted API access (for example via OpenRouter) lists around $0.09–$0.10 per 1M input/output tokens for the 235B models.