Aider, an AI-powered code assistant, achieved a state-of-the-art result of 26.3% on the SWE Bench Lite benchmark, surpassing the previous top leaderboard entry of 20.3% from Amazon Q Developer Agent.
Aider’s success is attributed to its focus on static code analysis, reliable LLM code editing, and pragmatic UX for AI pair programming.
Aider does not use RAG, vector search, or tool calling, and it does not give the LLM access to web search or the ability to unilaterally execute code.
It is designed as an interactive tool that lets engineers get real work done in real codebases through a chat interface.
The benchmark methodology involved running aider in each problem’s git repository, with the problem statement submitted as the opening chat message. Aider scored 25.0% using GPT-4o alone — itself a state-of-the-art result — before reaching 26.3% using both GPT-4o and Opus.
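The per-problem setup described above can be sketched as a shell session. This is a hedged illustration, not the actual benchmark harness: the repository path and problem-statement file are placeholders, and it assumes aider's standard `--model` and `--message` command-line flags.

```shell
# Hypothetical sketch of running aider on one SWE Bench Lite problem.
# <repo> and problem.txt are placeholders for the problem's checkout
# and its problem statement; the real harness automates this per task.
cd <repo>

# Start aider non-interactively, passing the problem statement as the
# opening chat message and letting it edit files in the git repo.
aider --model gpt-4o --message "$(cat problem.txt)"

# The resulting git diff is what gets evaluated against the
# benchmark's acceptance tests.
git diff
```

In the interactive workflow the same thing happens conversationally: the engineer opens aider inside the repository and pastes the task as the first chat message.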