Anthropic · 2026-04-24 · notable
Anthropic's 2026 Midterms Safety Update: Claude Scores 95–96% on Political Neutrality
Ahead of the 2026 US midterms, Anthropic published its election safety results: Claude Opus 4.7 scored 95% on political neutrality and 100% on election policy compliance. In the first-ever test of autonomous influence operations, Opus 4.7 refused 94% of attempts and Sonnet 4.6 refused 90%.
Anthropic's pre-midterms safety report: Claude scores 95–96% on political neutrality and refuses 90–94% of autonomous influence-operation tasks.
Key specs
| Metric | Result |
|---|---|
| Opus 4.7 election policy compliance | 100% |
| Sonnet 4.6 election policy compliance | 99.8% |
| Opus 4.7 political neutrality | 95% |
| Sonnet 4.6 political neutrality | 96% |
| Opus 4.7 influence-operation resistance | 94% |
| Sonnet 4.6 influence-operation resistance | 90% |
What is it?
On April 24, 2026, Anthropic published an update on its election integrity measures ahead of the 2026 US midterms. The update discloses testing results for the first time. In political neutrality evaluations across 600 prompts, Opus 4.7 and Sonnet 4.6 scored 95% and 96%, respectively. In election-policy compliance tests, they gave appropriate responses 100% and 99.8% of the time, respectively. In the first-ever test of whether models can autonomously run a multi-step influence operation end-to-end, Opus 4.7 refused 94% of attempts and Sonnet 4.6 refused 90%. The update also introduces TurboVote banners for voter information queries.
How does it work?
Claude's usage policy bans deceptive political campaigns, fake content meant to influence political discourse, voter fraud, and interference with voting infrastructure. Automated classifiers monitor for violations. Web search integration triggers on 92–95% of queries about candidates and voting procedures, surfacing nonpartisan resources via TurboVote banners.
Why does it matter?
The key new data point is the autonomous influence-operation testing: prior election safety reports from major labs had not included results on whether models could plan and execute multi-step disinformation campaigns without human direction. The 90–94% refusal rates give a concrete baseline for this threat surface. For teams deploying Claude in political or civic contexts, this is the clearest public documentation of which behaviors are blocked.
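Teams that want to compare their own red-team runs against these published baselines may find that the sample size matters as much as the headline rate. The sketch below is illustrative only: it computes a refusal rate with a Wilson score interval from a tally of refused versus attempted tasks. The function name `wilson_interval` and the counts (47 of 50) are hypothetical and are not taken from Anthropic's report, which does not describe its statistical method here.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half_width, center + half_width


# Hypothetical tally from a team's own test run (not Anthropic's data).
refused, attempts = 47, 50
low, high = wilson_interval(refused, attempts)
print(f"refusal rate: {refused / attempts:.0%} (approx. 95% CI {low:.1%} to {high:.1%})")
```

With only a few dozen attempts the interval is wide, so a locally measured rate a few points above or below 90–94% is not necessarily a meaningful difference.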
Who is it for?
Teams deploying Claude in civic, media, or political contexts; AI governance researchers