Qwen (Alibaba) · 2026-04-22 · seismic

Qwen3.6-27B — Flagship-Level Coding in a 27B Dense Open-Weights Model

Item: Qwen3.6-27B — Flagship-Level Coding in a 27B Dense Open-Weights Model
Rating: 5
Author: AI/TLDR

Qwen3.6-27B scores 77.2% SWE-bench Verified at 27B dense parameters — runnable locally. Apache 2.0, multimodal (text + image + video), 262k context (1M with YaRN), hybrid DeltaNet-attention architecture.

A 27B dense model scoring 77.2% SWE-bench Verified — Apache 2.0, multimodal, runnable on consumer hardware.

Key specs

Parameters	27B
Context window	262k tokens (1M with YaRN)
SWE-bench	77.2%
GPQA	87.8%
Swe bench pro	53.5%
Aime 2026	94.1%
Live code bench v6	83.9%
Mmmu	82.9%
Video mme (w/ subtitles)	87.7%
Hn points	215

What is it?

Qwen3.6-27B is the first dense model in the Qwen3.6 family from Alibaba's Qwen team, released April 22, 2026. The previously released 35B-A3B variant is a sparse MoE; this is a 27B fully-dense model. It scores 77.2% on SWE-bench Verified and 94.1% on AIME 2026. The architecture handles text, images, and video natively. Apache 2.0 license.

How does it work?

The model alternates Gated DeltaNet layers (linear attention) with standard Gated Attention layers — each group of four transformer layers has three DeltaNet blocks followed by one standard attention block. DeltaNet's delta-rule update mechanism processes long sequences more efficiently than full quadratic attention. A new 'preserve thinking' option lets the model carry reasoning traces from prior turns across a multi-turn agentic loop, reducing redundant replanning. Native context is 262,144 tokens, extensible to 1M with YaRN.

Why does it matter?

At 27B parameters it runs on a single consumer GPU with quantization (Q4/Q8), while matching or exceeding prior-generation 70B-class models on coding benchmarks. For teams that cannot send source code to an external API, this raises the practical ceiling for self-hosted coding agents. Multimodal video input is rare in open-weights models at this scale.

Who is it for?

Self-hosters; teams running local coding agents on consumer GPUs; developers needing open-weights multimodal reasoning.

Try it

huggingface.co/Qwen/Qwen3.6-27B — also via vLLM ≥0.19.0, SGLang ≥0.5.10, or Ollama