1    gpt-5-chat                            3063
2    claude-opus-4-1-20250805              3658
3    claude-sonnet-4-20250514              5472
3    gpt-4.1-2025-04-14                    6220
3    claude-opus-4-1-20250805 (Thinking)   3413
3    claude-opus-4-20250514                5090
4    gemini-2.5-pro-preview-06-05          4724
8    claude-opus-4-20250514 (Thinking)     4874
8    claude-sonnet-4-20250514 (Thinking)   5613
8    gemini-2.5-flash-preview-05-20        6887
9    o3-2025-04-16-medium*                 7410
12   llama4-maverick-instruct-basic        7445
12   o4-mini-2025-04-16-medium*            7263
* The API for this model does not consistently format its responses in Markdown. These raw responses from the API are used in head-to-head comparisons.