AI/TLDR

Alibaba / Qwen · 2026-04-17 · major

Qwen3.6-35B-A3B — 35B MoE Coding Model, 3B Active Params, SWE-bench 73.4%

Alibaba open-sources a 35B MoE model activating only 3B parameters per token. Scores 73.4% SWE-bench Verified and 86.0% GPQA. Fits in ~20 GB locally. Apache 2.0.


Alibaba's 35B MoE model uses only 3B active params, scores 73.4% SWE-bench Verified, and runs locally on a MacBook Pro.

Key specs

License: Apache 2.0
Active params: 3B
Context window: 262K tokens
SWE-bench Verified: 73.4%
GPQA: 86.0%
Total parameters: 35B
AIME 2026: 92.7%
LiveCodeBench v6: 80.4%

What is it?

Qwen3.6-35B-A3B is a sparse Mixture-of-Experts model released under Apache 2.0. With 256 experts and only 3B parameters activated per forward pass, it delivers coding and reasoning performance comparable to dense models far larger than its active size. It supports text, image, and video inputs with a native 262K-token context window extensible to ~1M tokens via YaRN.
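The figures in this paragraph imply two ratios that are easy to check by hand. The sketch below is only arithmetic on the numbers quoted above. It assumes that "262K" means 262,144 (2**18) tokens and that the YaRN extension uses a scaling factor of 4; the announcement states neither.

```python
TOTAL_PARAMS = 35e9        # total parameters, from the article
ACTIVE_PARAMS = 3e9        # parameters activated per token, from the article
NATIVE_CONTEXT = 262_144   # assumption: "262K" read as 2**18 tokens


def active_fraction(active, total):
    """Share of the weights that take part in one forward pass."""
    return active / total


def extended_context(native_tokens, yarn_factor):
    """Context length after scaling the native window by a YaRN factor."""
    return native_tokens * yarn_factor


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
extended = extended_context(NATIVE_CONTEXT, 4)  # assumption: factor of 4

print(f"active share per token: {fraction:.1%}")
print(f"context at 4x scaling: {extended:,} tokens")
```

Under these assumptions roughly 8.6% of the weights are active per token, and a factor of 4 gives 1,048,576 tokens, which lines up with the ~1M figure.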

How does it work?

The model combines Gated DeltaNet (a linear-attention variant, used here in a hybrid attention design) with a 256-expert MoE feed-forward layer. A preserve_thinking flag retains reasoning traces across multi-turn agent conversations, so the model does not have to repeat earlier reasoning. Multi-Token Prediction enables speculative decoding for higher throughput. The BF16 weights run locally via vLLM, SGLang, or LM Studio.
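To make the MoE feed-forward idea concrete, here is a minimal sketch of generic top-k expert routing in plain Python. It is not Qwen's implementation: the value of TOP_K, the random gate logits, and the toy scalar "experts" are all hypothetical. Only the expert count of 256 comes from the text above.

```python
import math
import random

NUM_EXPERTS = 256  # from the article
TOP_K = 8          # hypothetical: the article does not say how many experts fire per token


def route_token(gate_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts."""
    # Indices of the k largest gate logits.
    top = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:top_k]
    # Softmax over the selected logits only, so the weights sum to 1.
    peak = max(gate_logits[i] for i in top)
    exps = [math.exp(gate_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, gate_logits, experts, top_k=TOP_K):
    """Weighted sum of the outputs of the selected experts; the rest are never called."""
    return sum(w * experts[i](x) for i, w in route_token(gate_logits, top_k))


rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]  # stand-in for a learned router
# Toy experts: each one scales its input by a different constant.
experts = [(lambda x, scale=i / NUM_EXPERTS: x * scale) for i in range(NUM_EXPERTS)]

chosen = route_token(logits)
output = moe_forward(1.0, logits, experts)
print(f"{len(chosen)} of {NUM_EXPERTS} experts used, output = {output:.4f}")
```

Only the selected experts run for a given token, which is why the active parameter count can be far smaller than the total.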

Why does it matter?

With only 3B active parameters, per-token inference compute is a fraction of what a comparably capable dense model needs. A 73.4% score on SWE-bench Verified puts it in frontier coding territory while still running locally. The preserve_thinking feature and native tool calling reduce prompt-engineering overhead for agentic coding workflows.
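Running locally depends on total parameters, not active ones, because every expert has to be resident in memory. As a rough check on the ~20 GB figure quoted in the summary, the sketch below estimates weight memory at a few precisions. The 8-bit and 4-bit rows are illustrative assumptions; the announcement mentions only BF16 and does not say which format the ~20 GB figure refers to.

```python
TOTAL_PARAMS = 35e9  # total parameters, from the article


def weight_memory_gb(num_params, bits_per_param):
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Bit widths other than BF16 are assumptions for illustration.
for label, bits in [("BF16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label:>5}: {weight_memory_gb(TOTAL_PARAMS, bits):.1f} GB")
```

This gives about 70 GB at BF16, 35 GB at 8 bits, and 17.5 GB at 4 bits, before KV cache and runtime overhead. A ~20 GB footprint is therefore consistent with roughly 4-bit quantized weights rather than the BF16 release.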

Who is it for?

It targets developers building coding agents, teams that want frontier-class reasoning locally, and agentic pipelines that need long context at low inference cost.

Try it

huggingface-cli download Qwen/Qwen3.6-35B-A3B  # or: qwen3.6-flash on Alibaba Cloud Model Studio


Tags

  • open-source
  • moe
  • coding
  • agentic
  • apache-2-0
  • multimodal
  • swe-bench
  • local-inference
  • qwen
