Fireship · 2026-04-06 · notable
Qwen 3.6 Plus tested — open-source beats Opus 4.5 and Gemini 3 (YouTube review)
A side-by-side YouTube test of Qwen 3.6 Plus against Claude Opus 4.5 and Gemini 3 on coding, reasoning, and agent tasks — the free, open-source model wins enough of them to have kicked off the Qwen 3.6 hype cycle.

The benchmarking video that helped start the 'Qwen 3.6 actually beats the closed frontier on some tasks' conversation.
What is it?
A ~2-weeks-old YouTube review video that ran Qwen 3.6 Plus head-to-head with Claude Opus 4.5 and Gemini 3 across a battery of coding, reasoning, and agent-style tasks. It's one of the videos that pushed the 'wait, this open model is actually competitive' reaction and set up Simon Willison's later pelican post.
How does it work?
The host hits each model with the same prompts — a mix of tool-using coding tasks, multi-step reasoning, SVG generation, and long-context summarization — and scores the outputs qualitatively with rationale. Qwen 3.6 Plus, running via OpenRouter, wins a surprising number of them outright; the rest it matches cheaply.
Why does it matter?
Written reviews of new models pile up fast; side-by-side videos give a fairer read on real model quality because you can see the model's reply latency, tool calls, and failure modes. This one is worth watching specifically because the result — a free open model credibly matching the closed frontier on multiple real tasks — has direct implications for anyone currently paying for Opus or Gemini.
Who is it for?
Teams deciding whether to swap an expensive closed model for an open one.
Try it
https://openrouter.ai/qwen/qwen3.6-plus-preview