🤖 AI Summary
Existing language-model-based agents (e.g., SWE-agent, OpenHands) are often limited to Python-only issues and depend on the pre-constructed SWE-bench containers with already-reproduced issues, which limits how well they generalize to real-world, multi-language codebases.
Method: We propose Prometheus, a multi-agent framework for automated issue resolution across programming languages. It converts an entire repository into a unified knowledge graph of typed nodes covering files, abstract syntax trees (ASTs), and natural-language text, connected by five general edge types. The graph is persisted in Neo4j to support structured retrieval and reasoning over large codebases, and the agents use DeepSeek-V3 as the underlying model.
Contribution/Results: The framework resolves 28.67% of issues on SWE-bench Lite and 13.7% on SWE-bench Multilingual, at an average API cost of $0.23 and $0.38 per issue, respectively. It resolves 10 unique issues not addressed by prior work and is the first to demonstrate effectiveness across seven programming languages. It also resolves real-world GitHub issues in the LangChain and OpenHands repositories. The implementation is open source.
📝 Abstract
Language model (LM) agents, such as SWE-agent and OpenHands, have made progress toward automated issue resolution. However, existing approaches are often limited to Python-only issues and rely on pre-constructed containers in SWE-bench with reproduced issues, restricting their applicability to real-world, multi-language repositories. We present Prometheus, designed to resolve real-world issues beyond benchmark settings. Prometheus is a multi-agent system that transforms an entire code repository into a unified knowledge graph to guide context retrieval for issue resolution. Prometheus encodes files, abstract syntax trees, and natural language text into a graph of typed nodes and five general edge types to support multiple programming languages. Prometheus uses Neo4j for graph persistence, enabling scalable and structured reasoning over large codebases. Integrated with the DeepSeek-V3 model, Prometheus resolves 28.67% and 13.7% of issues on SWE-bench Lite and SWE-bench Multilingual, respectively, with an average API cost of $0.23 and $0.38 per issue. Prometheus resolves 10 unique issues not addressed by prior work and is the first to demonstrate effectiveness across seven programming languages. Moreover, it shows the ability to resolve real-world GitHub issues in the LangChain and OpenHands repositories. We have open-sourced Prometheus at: https://github.com/Pantheon-temple/Prometheus
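To make the graph structure described in the abstract more concrete, here is a minimal in-memory sketch of a typed knowledge graph with five edge types and one simple retrieval routine. It is an illustration under stated assumptions, not Prometheus's implementation: the abstract does not name the five edge types, so the `EdgeType` members are hypothetical, as are the `Node` and `KnowledgeGraph` classes, the `find_ast_nodes` helper, and the sample file names. The sketch also keeps everything in plain Python rather than in Neo4j.

```python
from dataclasses import dataclass, field
from enum import Enum


class EdgeType(Enum):
    # Hypothetical names: the abstract only says there are five general edge types.
    HAS_FILE = "HAS_FILE"      # directory -> file or subdirectory
    HAS_AST = "HAS_AST"        # file -> root AST node
    PARENT_OF = "PARENT_OF"    # AST node -> child AST node
    HAS_TEXT = "HAS_TEXT"      # directory or file -> natural-language text chunk
    NEXT_CHUNK = "NEXT_CHUNK"  # text chunk -> following text chunk


@dataclass(frozen=True)
class Node:
    node_id: int
    kind: str   # "file", "ast", or "text"
    label: str


@dataclass
class KnowledgeGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, kind, label):
        node = Node(len(self.nodes), kind, label)
        self.nodes[node.node_id] = node
        return node

    def add_edge(self, source, edge_type, target):
        self.edges.append((source.node_id, edge_type, target.node_id))

    def neighbors(self, node, edge_type):
        # Follow outgoing edges of one type from the given node.
        return [
            self.nodes[target]
            for source, etype, target in self.edges
            if source == node.node_id and etype is edge_type
        ]

    def find_ast_nodes(self, keyword):
        """Return (file label, AST label) pairs whose AST label contains keyword."""
        hits = []
        for file_node in (n for n in self.nodes.values() if n.kind == "file"):
            stack = self.neighbors(file_node, EdgeType.HAS_AST)
            while stack:
                current = stack.pop()
                if keyword in current.label:
                    hits.append((file_node.label, current.label))
                stack.extend(self.neighbors(current, EdgeType.PARENT_OF))
        return hits


# Build a tiny graph for a made-up Go repository.
graph = KnowledgeGraph()
repo = graph.add_node("file", "repo/")
src = graph.add_node("file", "repo/parser.go")
root = graph.add_node("ast", "source_file")
func = graph.add_node("ast", "function_declaration ParseConfig")
readme = graph.add_node("text", "README: how to configure the parser")
graph.add_edge(repo, EdgeType.HAS_FILE, src)
graph.add_edge(src, EdgeType.HAS_AST, root)
graph.add_edge(root, EdgeType.PARENT_OF, func)
graph.add_edge(repo, EdgeType.HAS_TEXT, readme)

matches = graph.find_ast_nodes("ParseConfig")
print(matches)
```

Because the node kinds and edge types carry no language-specific information, the same traversal would apply to a repository in any language for which an AST can be produced, which is the property the abstract attributes to the five general edge types.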