Qwen · 2026-04-30 · notable
Qwen-Scope — Open Sparse-Autoencoder Suite for Steering Qwen3 and Qwen3.5 Without Prompt Engineering
Alibaba's Qwen team open-sources 14 SAE checkpoints across 7 Qwen3/3.5 variants, with applications spanning inference steering, benchmark analysis, toxicity classification, and post-training cleanup.

An open SAE toolkit that turns Qwen3/3.5 internal features into knobs you can twist to steer outputs and audit benchmarks.
Key specs
| Spec | Value |
|---|---|
| SAE checkpoints | 14 |
| Model variants | 7 |
| Top-k | 50 or 100 |
| SAE width (dense) | 16× hidden size |
| SAE width (MoE) | 32K to 128K (16× to 64× expansion) |
| Code-switching reduction | over 50% |
| Safety feature coverage | 99.74% |
| Benchmark Spearman correlation | ~0.85 |
What is it?
Qwen-Scope is a research-grade suite of sparse autoencoders (SAEs) trained on the residual streams of seven Qwen3 and Qwen3.5 variants — from 1.7B up to the 35B-A3B MoE — released openly on Hugging Face with a companion paper and demo Space.
How does it work?
Each SAE applies a Top-k activation rule (k = 50 or 100) over an overcomplete latent basis, so only the k strongest latents stay active for each input. For dense backbones the SAE width is 16× the hidden size; for MoE backbones widths range from 32K to 128K (16× to 64× expansion). The team ships four reference workflows: inference-time steering by suppressing or amplifying latent features; SAE-based benchmark analysis, whose redundancy estimates reach roughly 0.85 Spearman correlation with performance-based redundancy across 17 benchmarks; data-centric workflows for toxicity classifiers and safety-data synthesis (99.74% feature coverage); and post-training fixes that cut code-switching ratios by over 50%.
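To make the Top-k rule and feature steering concrete, here is a minimal pure-Python sketch on a toy SAE. Everything in it is an illustrative assumption rather than the Qwen-Scope API: the function names (`topk_encode`, `decode`, `steer`), the random weights, the toy sizes (hidden size 8, k = 4), the ReLU applied to the kept latents, and the choice to steer by adding a rescaled decoder direction back to the residual vector. The released checkpoints use k = 50 or 100 and far larger widths.

```python
import random

def matvec(W, x):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def topk_encode(x, W_enc, b_enc, k):
    """Keep the k largest pre-activations (clamped at zero), zero out the rest."""
    pre = [p + b for p, b in zip(matvec(W_enc, x), b_enc)]
    keep = sorted(range(len(pre)), key=lambda i: pre[i], reverse=True)[:k]
    z = [0.0] * len(pre)
    for i in keep:
        z[i] = max(pre[i], 0.0)
    return z

def decode(z, W_dec, b_dec):
    """Reconstruct the input as a sparse sum of decoder rows (feature directions)."""
    out = list(b_dec)
    for zi, row in zip(z, W_dec):
        if zi != 0.0:
            for j, w in enumerate(row):
                out[j] += zi * w
    return out

def steer(x, W_enc, b_enc, W_dec, b_dec, k, feature, scale):
    """Rescale one latent: scale=0 suppresses it, scale>1 amplifies it.

    Only the change in that feature is added to x, so everything the SAE
    does not explain about x is left untouched (an assumed design choice).
    """
    z = topk_encode(x, W_enc, b_enc, k)
    delta = (scale - 1.0) * z[feature]
    return [xj + delta * w for xj, w in zip(x, W_dec[feature])]

# Toy setup: hidden size 8, 16x expansion as in the dense configs, k = 4.
random.seed(0)
d_model, expansion, k = 8, 16, 4
n_latents = d_model * expansion
W_enc = [[random.gauss(0, 1) for _ in range(d_model)] for _ in range(n_latents)]
b_enc = [0.0] * n_latents
W_dec = [[random.gauss(0, 1) for _ in range(d_model)] for _ in range(n_latents)]
b_dec = [0.0] * d_model
x = [random.gauss(0, 1) for _ in range(d_model)]

z = topk_encode(x, W_enc, b_enc, k)
active = [i for i, v in enumerate(z) if v != 0.0]
recon = decode(z, W_dec, b_dec)
suppressed = steer(x, W_enc, b_enc, W_dec, b_dec, k, feature=active[0], scale=0.0)
print("latents:", n_latents, "active:", len(active))
```

With real checkpoints the same idea would run inside a forward hook on the residual stream at the layer the SAE was trained on, with `feature` chosen by inspecting which latents fire on the behavior you want to change.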
Why does it matter?
SAEs have so far been a mostly academic curiosity, tied to small models from Anthropic and a few other labs. Shipping production-scale SAEs for a popular open model family, with concrete recipes for steering, eval analysis, and RL/SFT cleanup, moves interpretability out of research-paper territory and into the hands of anyone shipping a Qwen3 fine-tune.
Who is it for?
ML researchers, alignment teams, Qwen fine-tuners
Try it
https://huggingface.co/collections/Qwen/qwen-scope