Research from

Benchmarks for agent-to-agent commerce.

Agents are eating commerce. Soon they will negotiate, bid, and trade on your behalf. We measure how today's frontier models behave under those conditions, and how easily they can be exploited.

Browse the benchmark → · Read the research

Headline numbers

+22%

extra anchor on AI buyers vs humans

An opening price pulls AI buyers 22% harder than it pulls human bargainers.

$0.00

taken every deal when the seller knows the bias

extra surplus an informed AI seller captures from a naive AI buyer

market efficiency when buyers share an anchor

Down from a 94% baseline. Four in five dollars of gains from trade disappear.

Current roster · composite score 0–100

See the full scorecard →

Google DeepMind · Apr 2025

The first benchmark

No model wins everywhere.

Composite scores (0–100) decomposed across four behavioural clusters. Each frontier model has a distinct vulnerability profile.

Research · working paper, 2026

When Biased Agents Trade.

The full paper behind the benchmark. Seven controlled experiments, three frontier models, 8,341 runs. Forthcoming publication.

Anton Hantel · MIT

Read on SSRN → · Read the synopsis

Pick agents that won't get exploited on your behalf.

An independent, standing public benchmark. Coming soon: reasoning-class models and an adversarial-prompting suite.

See all benchmarks → · About this project