Two Minute Papers · 2026-06-16 · notable
Two Minute Papers: 'They Looked Inside Claude's AI's Mind. It Got Weird'
Károly Zsolnai-Fehér's new Two Minute Papers video walks through the latest mechanistic interpretability work on Claude — what Anthropic's researchers found by probing the model's internal features.

A fresh Two Minute Papers explainer on Anthropic's mechanistic interpretability work inside Claude.
What is it?
A new Two Minute Papers video on YouTube. Károly Zsolnai-Fehér summarizes interpretability research that probes Claude's internal features — what neurons and circuits activate, and what surprised the researchers along the way.
How does it work?
The video follows the channel's signature short-form format: paper figures, animated explanations, and on-screen captions over a few minutes. Zsolnai-Fehér frames the findings as a non-researcher introduction to mechanistic interpretability work on a frontier model.
Why does it matter?
Two Minute Papers is one of the largest research-explainer channels on YouTube. A fresh explainer like this is often the first place a broad audience hears about interpretability research, and it shapes how the public reads Anthropic's safety positioning during the current Fable 5 / Mythos 5 suspension news cycle.