Source: deepswe.datacurve.ai
<p>DeepSWE is a long-horizon software engineering benchmark that delivers four major advances over today's public benchmarks:</p> <p>Existing benchmarks fall short on several of these axes. SWE-bench Pro, the leading agentic coding benchmark, has tasks averaging just 120 lines of code to solve, and