Anthropic · 2026-04-24 · notable

Project Deal — Anthropic's Real-Money Agent Commerce Experiment: 186 Trades, $4k, Model Quality Determines Outcome

Rating: 3
Author: AI/TLDR

Anthropic ran a real-money classified marketplace for 69 employees in which Claude agents executed every trade. In total, 186 deals worth more than $4,000 closed. Participants represented by Claude Opus 4.5 closed about two more deals on average and sold identical items for roughly 70% more than those represented by Haiku 4.5, and participants with the weaker agents did not notice the difference.


Anthropic ran a real Craigslist-style marketplace where Claude agents handled all buying and selling — and better models quietly got better deals.

Key specs

Participants	69 employees
Deals closed	186
Total value	$4,000+
Opus vs Haiku price gap	~70% (broken bike: $65 under Opus vs $38 under Haiku)

What is it?

Project Deal was a one-week experiment that Anthropic ran in its San Francisco office in April 2026. Each of the 69 participating employees received $100 in gift cards to spend in a Claude-mediated classified marketplace. Claude agents autonomously executed every listing, negotiation, and trade on behalf of sellers and buyers. Anthropic ran four versions of the marketplace in parallel: two in which every agent used Opus 4.5, and two with a mix of Opus 4.5 and Haiku 4.5 agents. Only one of the four runs was 'real', meaning goods were actually exchanged after the experiment.

How does it work?

Agents first conducted intake interviews to learn what each employee wanted to sell or buy. They then posted listings, fielded inquiries, negotiated prices, and finalized deals without the human participants intervening. Outcome metrics were compared across model tiers. Opus-backed participants closed roughly two more deals on average and fetched significantly higher prices for identical items: one broken bike sold for $65 under Opus versus $38 under Haiku. Critically, participants with weaker models rated deal fairness virtually identically to those with stronger models (4.05 vs 4.06 on a 7-point scale).
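The arithmetic behind the bike example can be checked with a minimal Python sketch. It uses only the figures quoted above. The function name `relative_premium` and the constant names are illustrative and do not come from Anthropic's write-up. The single bike is one data point, so the result need not match however the headline ~70% figure was aggregated across items.

```python
def relative_premium(strong_price: float, weak_price: float) -> float:
    """Return how much more the stronger agent fetched, as a fraction of the weaker price."""
    if weak_price <= 0:
        raise ValueError("weak_price must be positive")
    return (strong_price - weak_price) / weak_price


# Prices reported for the same broken bike under each model tier.
OPUS_BIKE_PRICE = 65.0
HAIKU_BIKE_PRICE = 38.0

bike_premium = relative_premium(OPUS_BIKE_PRICE, HAIKU_BIKE_PRICE)

# Mean fairness ratings on the 7-point scale quoted in the text.
fairness_gap = abs(4.06 - 4.05)

print(f"Bike premium: {bike_premium:.0%}")
print(f"Fairness gap: {fairness_gap:.2f} points")
```

On the quoted numbers, the price gap for the bike is large while the difference in perceived fairness is about a hundredth of a point, which is the asymmetry the experiment highlights.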

Why does it matter?

This is the first controlled experiment showing that AI model quality creates asymmetric outcomes in agent-mediated commerce, and that the disadvantaged party cannot detect the gap. As agents take on more financial tasks (negotiation, purchasing, contract review), the choice of model becomes a form of economic leverage that participants cannot see. The finding that tactical prompt instructions had no significant effect on outcomes adds a second layer: in this experiment, model capability mattered more than prompt engineering.

Who is it for?

Developers building AI agents for commerce, negotiation, or procurement tasks, and product teams designing agent-mediated systems.