Qwen (Alibaba) · 2026-04-22 · seismic
Qwen3.6-27B — Flagship-Level Coding in a 27B Dense Open-Weights Model
Qwen3.6-27B scores 77.2% SWE-bench Verified at 27B dense parameters — runnable locally. Apache 2.0, multimodal (text + image + video), 262k context (1M with YaRN), hybrid DeltaNet-attention architecture.

A 27B dense model scoring 77.2% SWE-bench Verified — Apache 2.0, multimodal, runnable on consumer hardware.
Key specs
| Parameters | 27B |
|---|---|
| Context window | 262k tokens (1M with YaRN) |
| SWE-bench | 77.2% |
| GPQA | 87.8% |
| Swe bench pro | 53.5% |
| Aime 2026 | 94.1% |
| Live code bench v6 | 83.9% |
| Mmmu | 82.9% |
| Video mme (w/ subtitles) | 87.7% |
| Hn points | 215 |
What is it?
Qwen3.6-27B is the first dense model in the Qwen3.6 family from Alibaba's Qwen team, released April 22, 2026. The previously released 35B-A3B variant is a sparse MoE; this is a 27B fully-dense model. It scores 77.2% on SWE-bench Verified and 94.1% on AIME 2026. The architecture handles text, images, and video natively. Apache 2.0 license.
How does it work?
The model alternates Gated DeltaNet layers (linear attention) with standard Gated Attention layers — each group of four transformer layers has three DeltaNet blocks followed by one standard attention block. DeltaNet's delta-rule update mechanism processes long sequences more efficiently than full quadratic attention. A new 'preserve thinking' option lets the model carry reasoning traces from prior turns across a multi-turn agentic loop, reducing redundant replanning. Native context is 262,144 tokens, extensible to 1M with YaRN.
Why does it matter?
At 27B parameters it runs on a single consumer GPU with quantization (Q4/Q8), while matching or exceeding prior-generation 70B-class models on coding benchmarks. For teams that cannot send source code to an external API, this raises the practical ceiling for self-hosted coding agents. Multimodal video input is rare in open-weights models at this scale.
Who is it for?
Self-hosters; teams running local coding agents on consumer GPUs; developers needing open-weights multimodal reasoning.
Try it
huggingface.co/Qwen/Qwen3.6-27B — also via vLLM ≥0.19.0, SGLang ≥0.5.10, or Ollama