Simon Willison · 2026-06-26 · notable
Simon Willison: '2,000 people tried to hack my AI assistant'
Simon Willison covers Fernando Irarrázaval's HackMyClaw challenge, where 2,000 participants sent 6,000 email-based prompt injection attempts at a Claude Opus 4.6 assistant. The $1,000 bounty went unclaimed — no one extracted the protected secret.

A 2,000-person prompt-injection bounty against a Claude Opus 4.6 email assistant ended with the secret still safe.
What is it?
Simon Willison's June 26, 2026 post covers HackMyClaw, a public challenge run by Fernando Irarrázaval in which 2,000 participants tried to extract a 'secrets.env' file from Fiu, an email assistant running on Claude Opus 4.6. The $1,000 bounty, funded jointly by Corgea, Abnormal AI, and an anonymous donor, went unclaimed: after 6,000 email-based injection attempts, nobody had exfiltrated the file.
How does it work?
Participants emailed Fiu, an OpenClaw assistant rate-limited to ten messages per hour. Fiu had access to the secret file and explicit instructions never to reveal it. Attackers crafted payloads designed to override that instruction; every attempt that reached the inbox was processed on a schedule, and any successful injection would have leaked the secret in Fiu's reply. Fernando Irarrázaval closed the challenge when infrastructure and model costs grew too high to keep running.
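The intake rules described above can be sketched in a few lines of Python. This is an illustration only, not HackMyClaw's code: the class name, the per-sender scope of the limit, the sliding-window approach, and the verbatim leak check are all assumptions made for the example.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 3600  # one hour
MAX_PER_WINDOW = 10    # "ten messages per hour", per the write-up


class SlidingWindowLimiter:
    """Hypothetical per-sender limiter; the real scope of the limit is an assumption."""

    def __init__(self, limit=MAX_PER_WINDOW, window=WINDOW_SECONDS):
        self.limit = limit
        self.window = window
        self._accepted = defaultdict(deque)  # sender -> timestamps of accepted mail

    def allow(self, sender, now):
        stamps = self._accepted[sender]
        # Drop timestamps that have aged out of the window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        if len(stamps) >= self.limit:
            return False
        stamps.append(now)
        return True


def reply_leaks(reply, secret):
    """Naive success check: does the reply contain the secret verbatim?"""
    return secret in reply


limiter = SlidingWindowLimiter()
decisions = [limiter.allow("attacker@example.com", t) for t in range(11)]
print(decisions.count(True), decisions.count(False))
```

Eleven messages from one sender inside the same hour yield ten accepted and one rejected. A verbatim substring check like `reply_leaks` would miss an encoded or paraphrased leak, so a real challenge would likely need a stricter success test.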
Why does it matter?
HackMyClaw is one of the largest open-bounty tests of a frontier model's prompt-injection resistance, and the zero-success result is consistent with Anthropic's claims about Claude Opus 4.6's hardening. Simon Willison cautions that 6,000 failed attempts do not prove future invulnerability, but the data point matters for anyone weighing whether to ship Claude-backed agents into critical workflows.
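One way to see why zero successes is not proof of invulnerability is to ask what per-attempt success rate would still be plausible after 6,000 failures. The sketch below is not from Willison's post. It assumes every attempt is independent with the same success probability, which real attackers who adapt and share ideas do not satisfy, so read the number as a rough illustration.

```python
def zero_success_upper_bound(trials, confidence=0.95):
    """Largest per-attempt success rate still consistent with zero successes."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    # Solve (1 - p) ** trials = 1 - confidence for p.
    return 1 - (1 - confidence) ** (1 / trials)


bound = zero_success_upper_bound(6000)
print(f"95% upper bound on per-attempt success rate: {bound:.4%}")
```

Under those assumptions the bound is roughly 0.05%, close to the "rule of three" approximation of 3 / 6000. That is small but not zero, so an attacker with many more attempts could still have a real chance of success.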
Try it
https://simonwillison.net/2026/Jun/26/hack-my-ai-assistant/