Does the model know how many times it's called a tool?


The model is asked to call a tool a random number of times (between 1 and 100) and then to report how many times it called the tool.
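The source does not show `experiment.py`, so the following is only a minimal sketch of how one trial of such an experiment could be scored. It assumes the model itself chooses how many calls to make, that the harness counts calls independently, and that a trial is correct when the reported count equals the observed count. `run_trial`, `make_stub_model`, and `TrialResult` are hypothetical names, and the stub stands in for a real LLM client.

```python
from dataclasses import dataclass
from typing import Callable

Tool = Callable[[], None]
# Hypothetical model interface: receives the tool, returns the count it reports.
Model = Callable[[Tool], int]


@dataclass
class TrialResult:
    actual: int    # calls observed by the harness
    reported: int  # calls the model claims it made

    @property
    def correct(self) -> bool:
        # Assumption: "accurate" means the self-report matches the observed count.
        return self.reported == self.actual


def run_trial(model: Model) -> TrialResult:
    """Run one trial, counting tool calls independently of the model's report."""
    calls = 0

    def tool() -> None:
        nonlocal calls
        calls += 1

    reported = model(tool)
    return TrialResult(actual=calls, reported=reported)


def make_stub_model(n_calls: int, miscount_by: int = 0) -> Model:
    """Stand-in for a real model: calls the tool n_calls times, then reports
    n_calls + miscount_by (a nonzero miscount_by simulates losing count)."""

    def model(tool: Tool) -> int:
        for _ in range(n_calls):
            tool()
        return n_calls + miscount_by

    return model
```

For example, `run_trial(make_stub_model(61, miscount_by=-1))` yields a trial with 61 observed calls and a reported count of 60, which is scored as incorrect.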

Model	Experiments	Accuracy	Most Common Calls	Most Common %
claude-4-sonnet-20250514	10	0%	61	90%
claude-4-opus-20250514	10	100%	42	80%
gpt-4o	10	70%	20	30%
gpt-4o-mini	10	10%	75	10%
gpt-4.1	10	100%	10	100%
gpt-4.1-mini	10	60%	20	30%
o3	10	90%	9	30%
o4-mini	10	50%	3	20%
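The column definitions are not spelled out in the source. One plausible reading is that Accuracy is the share of experiments in which the reported count matched the actual count, Most Common Calls is the mode of the actual call counts, and Most Common % is the share of experiments with that mode. The sketch below computes the columns under that assumption; `summarize` is a hypothetical helper and the sample data is made up for illustration, not taken from the experiment.

```python
from collections import Counter


def summarize(trials: list[tuple[int, int]]) -> dict:
    """Summarize trials given as (actual_calls, reported_calls) pairs.

    Assumed definitions (not confirmed by the source):
      accuracy           = fraction of trials where reported == actual
      most_common_calls  = mode of the actual call counts
      most_common_share  = fraction of trials with that call count
    """
    if not trials:
        raise ValueError("need at least one trial")
    n = len(trials)
    correct = sum(1 for actual, reported in trials if actual == reported)
    mode, mode_count = Counter(actual for actual, _ in trials).most_common(1)[0]
    return {
        "experiments": n,
        "accuracy": correct / n,
        "most_common_calls": mode,
        "most_common_share": mode_count / n,
    }


# Made-up data: three trials with 5 calls (one misreported as 4), one with 7.
example = summarize([(5, 5), (5, 5), (5, 4), (7, 7)])
```

Under this reading, a high Most Common % alongside a low Accuracy is not contradictory: a model can call the tool the same number of times in most runs and still misreport that number.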

# Run with default settings (Claude Sonnet 4, 10 experiments)
uv run experiment.py

# Run with different model
uv run experiment.py --model "openai:gpt-4o-mini"

# Run with more experiments
uv run experiment.py --experiments 50

# Combined options
uv run experiment.py --model "openai:gpt-4o" --experiments 25
