GPT-5 on SWE-bench: Cost & performance deep-dive
This blog post covers the results of running mini-SWE-agent with GPT-5, GPT-5-mini, and GPT-5-nano. Results will be added to the SWE-bench (bash-only) leaderboard shortly.
- GPT-5 is as good as Sonnet 4, but quite a bit cheaper
- GPT-5-mini gives up only a little performance (5 percentage points) and is incredibly cheap
- GPT-5-nano is even cheaper; I would say you pay half for half the performance
- You can reproduce our numbers for just $18 (with GPT-5-mini) using the command at the bottom!