Skynet Report

SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice.

It gives the LM a set of LM-centric commands for browsing the repository and for viewing, editing, and executing code files.
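To make the idea of LM-centric commands concrete, here is a minimal sketch of what such a command layer might look like. The class name `ToyInterface` and the commands `list_files`, `view`, `edit`, and `run` are hypothetical names chosen for illustration; they are not SWE-agent's actual commands or API. The sketch assumes each command returns short plain text that could be fed back to the model.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


class ToyInterface:
    """Hypothetical command layer; not SWE-agent's real interface."""

    def __init__(self, root):
        self.root = Path(root)

    def list_files(self):
        # Browse: every file under the repository root, as relative paths.
        return sorted(
            str(p.relative_to(self.root))
            for p in self.root.rglob("*")
            if p.is_file()
        )

    def view(self, path, start=1, count=20):
        # View: a numbered window of lines rather than the whole file.
        lines = (self.root / path).read_text().splitlines()
        window = lines[start - 1:start - 1 + count]
        return "\n".join(f"{start + i}: {line}" for i, line in enumerate(window))

    def edit(self, path, start, end, replacement):
        # Edit: replace lines start..end (1-indexed, inclusive).
        target = self.root / path
        lines = target.read_text().splitlines()
        lines[start - 1:end] = replacement.splitlines()
        target.write_text("\n".join(lines) + "\n")

    def run(self, path):
        # Execute: run a Python file and return its combined output.
        result = subprocess.run(
            [sys.executable, str(self.root / path)],
            capture_output=True,
            text=True,
            timeout=30,
        )
        return result.stdout + result.stderr


def demo():
    # Create a one-line "buggy" script, run it, patch it, run it again.
    with tempfile.TemporaryDirectory() as tmp:
        iface = ToyInterface(tmp)
        (Path(tmp) / "bug.py").write_text("print(1 + 1 == 3)\n")
        before = iface.run("bug.py").strip()
        iface.edit("bug.py", 1, 1, "print(1 + 1 == 2)")
        after = iface.run("bug.py").strip()
        return before, after
```

In a real agent the LM would choose which command to issue at each step; here `demo()` simply hard-codes one run, edit, run cycle to show how the pieces fit together.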

The system has been evaluated on the SWE-bench benchmark, where it resolved 12.29% of issues on the full test set, the best performance to date.

The agent can be run on any GitHub issue.

