Live-SWE-agent | The First Live AI Software Agent

Live-SWE-agent is the first live, runtime self-evolving software engineering agent that expands and revises its own capabilities on the fly while working on a real-world issue. Our key insight is that software agents are themselves software systems, and modern LLM-based agents already possess the intrinsic capability to extend or modify their own behavior at runtime.

📣 News

[Nov 24th, 2025]: Claude Opus 4.5 + Live-SWE-agent scores 79.2% on SWE-bench Verified, leading all current open-source scaffolds and coming very close to Anthropic’s internal, manually engineered scaffold for Opus 4.5!!
[Nov 20th, 2025]: Gemini 3 Pro + Live-SWE-agent scores 77.4% on SWE-bench Verified, outperforming all available models (including Claude 4.5)!
[Nov 17th, 2025]: Live-SWE-agent achieves the new state-of-the-art solve rate of 45.8% on SWE-Bench Pro!
[Nov 17th, 2025]: We've released Live-SWE-agent 1.0.0!

🏆 Leaderboard

For software tasks, recent LLMs are often benchmarked using manually engineered, proprietary agent scaffolds, which makes it difficult to compare the true capabilities of different models fairly.

Live-SWE-agent not only demonstrates that a minimal, open, and live scaffold can already outperform proprietary scaffolds, but also offers a unified and powerful platform that enables genuinely fair, apples-to-apples comparisons for future model releases.

As shown below, on our leaderboard of recent models (all evaluated with Live-SWE-agent), Claude Opus 4.5 holds the #1 spot by a large margin, with a score of 79.2% on SWE-bench Verified.

More model scores are coming soon! For more details, please visit our leaderboard. Feel free to submit your model's evaluation results to help build a more comprehensive and fair benchmarking platform!
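To make the percentages above concrete, here is a minimal sketch of how a solve rate relates to a count of resolved issues. It assumes the full 500-instance SWE-bench Verified set was evaluated; that size is not stated in this README, and the helper names `solve_rate` and `resolved_from_rate` are hypothetical, not part of Live-SWE-agent.

```python
# Assumption: SWE-bench Verified contains 500 instances (not stated in this README).
VERIFIED_SIZE = 500


def solve_rate(resolved: int, total: int) -> float:
    """Solve rate in percent: resolved instances divided by evaluated instances."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * resolved / total


def resolved_from_rate(rate_percent: float, total: int) -> int:
    """Invert a reported percentage back to an approximate instance count."""
    return round(rate_percent * total / 100)


if __name__ == "__main__":
    # Scores reported in the News section above.
    reported = {"Claude Opus 4.5": 79.2, "Gemini 3 Pro": 77.4}
    for model, rate in sorted(reported.items(), key=lambda kv: kv[1], reverse=True):
        count = resolved_from_rate(rate, VERIFIED_SIZE)
        print(f"{model}: {rate}% is about {count} of {VERIFIED_SIZE} issues")
```

Under that assumption, the 1.8-point gap between the two reported scores corresponds to roughly nine issues.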

📊 Comparison

The graph below compares Live-SWE-agent with state-of-the-art open-source solutions and proprietary commercial agent scaffolds on SWE-bench Verified and SWE-Bench Pro.

🚀 Setup

We built Live-SWE-agent on top of the popular mini-swe-agent framework with minimal modifications.

To use Live-SWE-agent, first install mini-swe-agent by following its installation guide, then run it with the custom Live-SWE-agent config:

mini --config config/livesweagent.yaml # using custom Live-SWE-agent config

See the config folder for more details.
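As a rough illustration of the step above, a small Python wrapper can check the two prerequisites (the `mini` executable and the config file) before launching. This is only a sketch: `build_command` and `preflight` are hypothetical helpers, not part of Live-SWE-agent or mini-swe-agent. Only the `mini` executable, the `--config` flag, and the config path come from the command above, and the path is assumed to be relative to the repository root.

```python
import shutil
import subprocess
import sys
from pathlib import Path

# Path taken from the command above; assumed relative to the repository root.
DEFAULT_CONFIG = Path("config/livesweagent.yaml")


def build_command(config: Path = DEFAULT_CONFIG) -> list[str]:
    """Return the argv list equivalent to `mini --config <config>`."""
    return ["mini", "--config", str(config)]


def preflight(config: Path = DEFAULT_CONFIG) -> list[str]:
    """Collect human-readable problems that would stop the launch."""
    problems = []
    if shutil.which("mini") is None:
        problems.append("`mini` not found on PATH; install mini-swe-agent first")
    if not config.is_file():
        problems.append(f"config file not found: {config}")
    return problems


if __name__ == "__main__":
    issues = preflight()
    if issues:
        # Report every problem at once, then exit with a non-zero status.
        sys.exit("\n".join(issues))
    # Hand control to the mini-swe-agent CLI and propagate its exit code.
    sys.exit(subprocess.call(build_command()))
```

Run from the repository root, the script either lists what is missing or starts `mini` with the Live-SWE-agent config.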

⚙️ Artifacts

You can download the complete trajectories, patches, and results of Live-SWE-agent in our v1.0.0 release:

swebench_verified: complete runs on SWE-bench Verified
swebench_pro: complete runs on SWE-Bench Pro

You can also obtain them from our 🤗 Hugging Face datasets.

📜 Attribution

@article{livesweagent,
  author    = {Xia, Chunqiu Steven and Wang, Zhe and Yang, Yan and Wei, Yuxiang and Zhang, Lingming},
  title     = {Live-SWE-agent: Can Software Engineering Agents Self-Evolve on the Fly?},
  year      = {2025},
  journal   = {arXiv preprint},
}
