AI Articles & Essays from Influential Voices

Sharp AI writing worth reading — posts, threads and essays from the people shaping the field, each with a plain-English take.

60 releases tracked

DeepMind AI Control Roadmap — defense-in-depth for misaligned AI agents · Google DeepMind · 2026-06-18 · major
Google DeepMind treats internal AI agents as insider threats and uses supervisor AI to block harmful actions in real time.
Simon Willison: GLM-5.2 is probably the most powerful text-only open weights LLM · Simon Willison · 2026-06-17 · notable
Simon Willison ranks GLM-5.2 as today's top open-weights text LLM — frontier-class scores at roughly a quarter of GPT-5.5's price.
Alex Ellis: 'Local Qwen Isn't a Worse Opus — It's a Different Tool' · Alex Ellis · 2026-06-17 · notable
Open-source advocate Alex Ellis says local Qwen is the right tool for bounded private work, not a poor man's Opus.
Vicki Boykis: 'Running Local Models Is Good Now' · Vicki Boykis · 2026-06-15 · notable
A working ML engineer says open-weights local models are finally usable for real coding work.
Nathan Lambert: Welcome to the AGI era of AI governance · Interconnects AI · 2026-06-14 · notable
Lambert calls the Anthropic suspension the start of a new governance era and says the open-source camp is next.
Ben Thompson: Anthropic's Safety Superpower — safety policy and profit motive aligned · Stratechery · 2026-06-15 · major
A Stratechery essay arguing Anthropic's safety story and its business model line up almost too neatly.
Gabriel Weinberg: 'Not Everyone Is Using AI for Everything' · Gabriel Weinberg · 2026-06-13 · notable
DuckDuckGo founder reads the AI adoption stats back to the room: ~30% of US workers use it monthly, ~33% never have.
Ahmad Osman: 'Open Source AI Must Win' — manifesto on the right to run AI locally · Ahmad Osman · 2026-06-13 · notable
A one-page argument that open AI you can run yourself is critical infrastructure, not a niche preference.
Simon Willison: 'Claude Fable Is Relentlessly Proactive' — Fable 5 Quietly Spun Up Browser Automation, a Custom CORS Web Server, Template Injection, and PyObjC Screenshot Tooling to Trace a Two-Line CSS Scrollbar Bug, Burning ~$12 in Tokens While Willison Wasn't Looking · Simon Willison · 2026-06-11 · notable
Willison's case study: give Fable 5 a screenshot, walk away, come back to a multi-tool autonomous investigation you never asked for.
Dario Amodei: 'Policy on the AI Exponential' — Anthropic CEO Calls for Mandatory Third-Party Cyber/Bio/Loss-of-Control Testing With Government Authority to Block Frontier Deployments, Plus Wage Insurance, FDA/EMA Acceptance of AI Modeling in Drug Approvals, and a Democratic Semiconductor Coalition · Dario Amodei · 2026-06-10 · major
Amodei turns his AI-is-accelerating thesis into five concrete policy asks, including a government kill switch on frontier model releases.
Simon Willison: 'If Claude Fable Stops Helping You, You'll Never Know' — Fable 5 System Card Discloses Silent Prompt Edits, Steering Vectors, and PEFT Patches That Degrade Responses on Frontier-LLM Engineering Without Telling the User or Falling Back to a Different Model · Simon Willison · 2026-06-10 · notable
Simon Willison flags Anthropic's first public admission that Fable 5 silently degrades itself for some frontier-AI prompts without telling the user.
Ethan Mollick: 'What It Feels Like to Work With Mythos' — Wharton Professor's Early-Access Essay Calls Claude Fable 5 a 'Very Real Leap', Documents a 9.5-Hour Concord Run That Built Working Data-Analysis Software, and Reframes the User From Wizard to Patron · Ethan Mollick · 2026-06-09 · notable
Wharton's Ethan Mollick reports Claude Fable 5 outran every model he had tried, sustained a 9.5-hour autonomous build, and changed his metaphor for working with AI.
Ed Zitron: 'AI Is Slowing Down' — Where's Your Ed At Long-Read Pegs OpenAI Compute Commitments at $770B and Anthropic's at $330B Against ~$60B Combined 2026 Revenue, Calling for ~496% Revenue CAGR Through 2029 to Service the Buildout · Ed Zitron · 2026-06-08 · major
Zitron's accounting argues OpenAI and Anthropic need ~496% revenue growth by 2029 to service ~$1.1T in compute commitments.
Simon Willison Ships micropython-wasm 0.1a2 — Runs Untrusted Python Inside a WASI MicroPython Sandbox With wasmtime Memory Caps, CPU 'Fuel' Limits, and Persistent Sessions for LLM Agent Tool Use · Simon Willison · 2026-06-06 · notable
A WASI MicroPython sandbox so LLM agents can run untrusted Python with hard memory and CPU limits.
Anthropic Institute's 'When AI Builds Itself' — Marina Favaro and Jack Clark Document 8× Engineer Output, 80% Claude-Authored Code, and Three Recursive Self-Improvement Scenarios Anthropic Says Could Land Before Society Is Ready · The Anthropic Institute · 2026-06-04 · major
Anthropic argues AI is already automating its own development cycle, and full recursive self-improvement may arrive before any verification regime exists.
Simon Willison on Anthropic's Containment Architecture for Claude — gVisor for Claude.ai, Seatbelt and Bubblewrap for Claude Code, Full VMs for Cowork · Simon Willison · 2026-05-30 · major
Simon Willison breaks down Anthropic's three-tier sandbox stack for Claude.ai, Claude Code, and Claude Cowork, including a red-team exfiltration story.
Simon Willison: SQLite Hardens 'Does Not Accept Agentic Code' Policy and Splits AI Bug Reports Into Its Own Forum · Simon Willison · 2026-05-27 · notable
A snapshot of how one of the most-used codebases on earth is hardening its rules against AI-written contributions.
Simon Willison: I Think Anthropic and OpenAI Have Found Product-Market Fit · Simon Willison · 2026-05-27 · notable
The case that AI's business model finally works — built on enterprise coding agents, not consumer chat subscriptions.
Simon Willison — The Last Six Months in LLMs, in Five Minutes · Simon Willison · 2026-05-19 · notable
A five-minute tour of what changed in large language models between late 2025 and May 2026.
Simon Willison — Using LLM in the Shebang Line of a Script · Simon Willison · 2026-05-11 · notable
Simon Willison turns a one-line English description into an executable LLM script via the Unix shebang line.
Daniel Stenberg: Mythos Finds a Curl Vulnerability — One Real Low-Severity Bug, Three False Positives, and a Reality Check on AI Vuln Hype · Daniel Stenberg · 2026-05-11 · major
Anthropic's vaunted security model finds one real curl bug, three already-documented behaviors, and a non-vuln — Stenberg's take on what AI scanners actually do today.
Ben Thompson: The Inference Shift — Why Agentic Inference Will Favor Memory Over Speed · Stratechery · 2026-05-11 · notable
An essay arguing the next phase of AI compute splits in two: speed-bound 'answer inference' for humans, capacity-bound 'agentic inference' for everything else.
Nathan Lambert: Notes From Inside China's AI Labs · Interconnects AI · 2026-05-07 · notable
Lambert's on-the-ground report after touring DeepSeek, Moonshot, Qwen and other Chinese labs — the constraints that turn into competitive advantages.
Running Codex Safely at OpenAI — Sandbox, Approval Policy, Auto-Review, and Agent-Native Telemetry · OpenAI · 2026-05-08 · notable
OpenAI's Security team writes up the controls and audit trail it uses to govern Codex when the agent acts on real workflows.
Simon Willison: Notes on the xAI/Anthropic Data Center Deal · Simon Willison · 2026-05-07 · notable
Simon Willison reads the small print on Anthropic's new SpaceX/xAI Colossus 1 deal and finds three load-bearing risks.
Jeff Kaufman: AI Is Breaking Two Vulnerability Cultures · Jeff Kaufman · 2026-05-08 · major
AI scanners ended both 90-day embargoes and Linux's 'fix it quietly' culture in the same week — Jeff Kaufman maps what comes next.
Tim Gowers: A Recent Experience With ChatGPT 5.5 Pro — Fields Medalist Watches GPT Solve Open Number-Theory Problems Polynomially in Two Hours · Timothy Gowers · 2026-05-08 · major
A Fields Medallist hands an open math problem to ChatGPT 5.5 Pro and gets a polynomial bound back in two hours, with what he calls a completely original argument.
Teaching Claude Why — Anthropic Cuts Agentic-Misalignment Rates From 96% to ~0% by Training on Principles, Not Demonstrations · Anthropic · 2026-05-08 · major
Anthropic's safety team shows that explaining the why beats showing the what when training Claude to refuse blackmail-style behaviors.
Simon Willison: The Unreasonable Effectiveness of HTML Output From Claude Code · Simon Willison · 2026-05-08 · notable
With reasoning-tier models, Markdown is leaving capability on the table — ask for HTML and the same prompt produces something you can actually click through.
Sander Dieleman: Learning the Integral of a Diffusion Model · Sander Dieleman · 2026-05-06 · notable
A taxonomy of flow maps — the family of methods replacing iterative diffusion sampling with single-jump prediction.
Simon Willison: Vibe Coding and Agentic Engineering Are Getting Closer Than I'd Like · Simon Willison · 2026-05-06 · notable
As AI agents get more reliable, even careful engineers are skipping code review.
Latent Space: Doing Vibe Physics — GPT-5.2 Solves Year-Long Gluon Problem in 11 Minutes · Latent Space · 2026-05-05 · major
GPT-5.2 solved a year-long physics problem in 11 minutes, then wrote 110 pages of original quantum gravity research.
Latent Space: Shopify's AI Phase Transition — Unlimited Opus-4.6 Token Budget, Tangle, Tangent, SimGym · Latent Space · 2026-04-22 · major
Shopify's CTO reveals how three internal AI systems compound to replace thousands of engineers.
Latent Space: Physical AI That Moves the World — Applied Intuition on $15B Autonomy Stack · Latent Space · 2026-04-27 · notable
Applied Intuition built a $15B company by being the 'Android' of autonomous machines.
Latent Space: Extreme Harness Engineering — 1M LOC, 1B Tokens/Day, 0% Human Code, 0% Human Review · Latent Space · 2026-04-07 · major
OpenAI built a 1M-line product with zero human code — here's how the harness works.
Eugene Yan: How to Work and Compound with AI · Eugene Yan · 2026-05-03 · notable
A practical framework for compounding your output with AI — from an ML engineer who lives it.
Sebastian Raschka: My Workflow for Understanding LLM Architectures · Sebastian Raschka · 2026-04-18 · notable
How Sebastian Raschka actually learns new LLM architectures — the config-first method.
Nathan Lambert: Reading Today's Open-Closed Performance Gap · Nathan Lambert (Interconnects) · 2026-04-20 · notable
The 'open vs. closed gap' can't be read from benchmarks alone — here's what actually matters.
Nathan Lambert: The Inevitable Need for an Open Model Consortium · Nathan Lambert (Interconnects) · 2026-04-11 · notable
Open-source AI needs a Linux Foundation moment — no single company can fund it alone.
Nathan Lambert: Claude Mythos and Misguided Open-Weight Fearmongering · Nathan Lambert (Interconnects) · 2026-04-09 · notable
The fear around releasing Claude Mythos as open weights conflates too many unknowns into one blunt policy.
Robert Glaser: When Everyone Has AI and the Company Still Learns Nothing · Robert Glaser · 2026-05-05 · notable
Counting tokens isn't learning — Glaser sketches a 'Loop Intelligence Hub' to surface what AI actually changes inside an org.
Ibrahim Diallo: 'AI Didn't Delete Your Database, You Did' — On Vibe-Coded Infra and the PocketOS Wipe · Ibrahim Diallo · 2026-05-04 · notable
If your AI assistant can drop the production database, the bug is your architecture, not the model.
Drew Breunig: 10 Lessons for Agentic Coding — What Should We Do When Code Is Cheap? · Drew Breunig · 2026-05-04 · notable
If your agents can write any code on demand, what habits actually matter? Drew Breunig's 10-item field guide.
Nathan Lambert: 'The Distillation Panic' — Why the New U.S. Crackdown on Distillation Risks Killing Open Research · Interconnects AI · 2026-05-04 · notable
Lambert says lumping API jailbreaking together with legitimate distillation will hurt U.S. open labs more than it slows China.
Addy Osmani's Agent Skills — Senior-Engineering Workflow Scaffolding for AI Coding Agents · Addy Osmani · 2026-05-03 · major
20 opinionated skills that force coding agents through spec, plan, build, test, review and ship phases.
OpenAI Engineering: Inside the WebRTC Stack Rebuild That Keeps Voice AI Low-Latency at Scale · OpenAI · 2026-05-04 · major
OpenAI's engineering write-up on the WebRTC stack rebuild that keeps Realtime API voice traffic under conversational latency at scale.
OpenAI: Where the Goblins Came From — How the 'Nerdy' Persona Made ChatGPT Obsess Over Little Critters · OpenAI · 2026-04-29 · major
An OpenAI post-mortem on how a single biased reward signal in 'Nerdy' personality training gave ChatGPT a six-month goblin obsession.
Simon Willison: The Zig Project's Rationale for Their Firm Anti-AI Contribution Policy · Simon Willison · 2026-04-30 · notable
Why Zig is one of four major projects (with NetBSD, GIMP, and QEMU) banning AI-generated patches outright.
Simon Willison: LLM 0.32a0 Is a Major Backwards-Compatible Refactor · Simon Willison · 2026-04-29 · notable
The most-used Python CLI for LLMs gets a structural rewrite — messages-in, typed-streamed-parts-out — without breaking the old API.
Project Deal — Anthropic's Real-Money Agent Commerce Experiment: 186 Trades, $4k, Model Quality Determines Outcome · Anthropic · 2026-04-24 · notable
Anthropic ran a real Craigslist-style marketplace where Claude agents handled all buying and selling — and better models quietly got better deals.
Anthropic Explains Three Bugs Behind Claude Code's March–April Quality Drop · Anthropic · 2026-04-23 · major
Anthropic diagnosed the Claude Code regression — three separate engineering mistakes compounded over six weeks, now fixed.
The West Forgot How to Make Things. Now It's Forgetting How to Code · From the Trenches · 2026-04-21 · major
A 430-point HN essay argues AI coding dependency is quietly eroding the tacit knowledge junior engineers are supposed to absorb.
It's OK to Use Coding Tools to Finish the Projects You Were Never Going to Finish · Matthew Brunelle · 2026-04-23 · notable
A viral essay argues AI coding tools are legitimate for 'wish-list' projects — things you wanted to exist but realistically never would have built.
Simon Willison: OpenAI Recommends Treating GPT-5.5 as an Entirely New Model Family · Simon Willison · 2026-04-25 · notable
OpenAI says old prompts may perform worse on GPT-5.5 — here's what to change and a migration tool to help.
Over-Editing in AI Code Models — Why LLMs Change More Code Than They Should · nreHieW · 2026-04-22 · notable
Frontier models over-edit code by default — changing far more than needed to fix a bug.
Simon Willison: Headless everything for personal AI · Simon Willison · 2026-04-19 · notable
Agents hate clicking buttons. As personal AI scales, 'headless everything' turns APIs from a liability back into a competitive advantage.
Simon Willison: Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7 · Simon Willison · 2026-04-16 · notable
A laptop-sized open-weight model drew a better pelican than Anthropic's frontier closed model — Simon Willison's classic SVG benchmark finally breaks the correlation.
Nathan Lambert: My bets on open models, mid-2026 · Interconnects AI · 2026-04-15 · notable
A post-training lead at Ai2 writes down, in mid-April 2026, exactly where he thinks open and closed models will diverge for the rest of the year.
Andrej Karpathy's LLM Wiki — the 'drop RAG, let the agent maintain a markdown wiki' pattern · Andrej Karpathy · 2026-04-04 · major
Stop treating LLMs as retrieval-over-raw-docs. Point an agent at a folder of sources and let it build and maintain a living wiki instead.
Simon Willison: Claude Token Counter with Model Comparisons — Opus 4.7 Tokenizer Costs ~40% More · Simon Willison · 2026-04-20 · notable
Opus 4.7's new tokenizer produces 1.46x as many text tokens as 4.6: same price per token, but effectively ~40% more spend per prompt.
