GPT-5 on SWE-bench: Cost & performance deep-dive

This blog post covers the results of running mini-SWE-agent with GPT-5, GPT-5-mini, and GPT-5-nano. Results will be added to the SWE-bench (bash-only) leaderboard shortly.

  • GPT-5 is as good as Sonnet 4, but quite a bit cheaper
  • GPT-5-mini gives up only a little performance (about 5 percentage points) and is incredibly cheap
  • GPT-5-nano is even cheaper: I would say you pay roughly half for half the performance
  • You can reproduce our numbers for just $18 (with GPT-5-mini) using the command at the bottom!