Trae Agent: An LLM-based Agent for Software Engineering with Test-time Scaling

📅 2025-07-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Existing prompting-based ensemble reasoning methods struggle to explore large ensemble spaces efficiently and lack repository-level understanding, which limits their effectiveness in software issue resolution. Method: The paper proposes Trae Agent, which it describes as the first agent-based ensemble reasoning approach for repository-level issue resolution. Trae Agent formulates issue resolution as an optimal solution search problem and tackles it with three modular agents for generation, pruning, and selection, supporting test-time scaling. Contribution/Results: The design addresses both limitations of prompting-based approaches: exploring large ensemble spaces and understanding the repository as a whole. Evaluated with three leading LLMs on the SWE-bench benchmark against four state-of-the-art ensemble reasoning techniques, Trae Agent achieves an average Pass@1 improvement of 10.22% over all baselines and a Pass@1 score of 75.20%, ranking first on the SWE-bench Verified leaderboard. The implementation is open source at https://github.com/bytedance/trae-agent.

📝 Abstract
Software issue resolution is a critical challenge in software engineering and has garnered increasing attention in recent years. With the rapid advancement of large language models (LLMs), substantial progress has been made in addressing real-world software engineering tasks. Recent studies have introduced ensemble reasoning techniques to enhance the performance of LLM-based issue resolution. However, existing prompting-based methods still face limitations in effectively exploring large ensemble spaces and lack the capacity for repository-level understanding, both of which constrain their overall effectiveness. In this paper, we propose Trae Agent, the first agent-based ensemble reasoning approach for repository-level issue resolution. Trae Agent formulates our goal as an optimal solution search problem and addresses two key challenges, i.e., large ensemble spaces and repository-level understanding, through modular agents for generation, pruning, and selection. We conduct extensive experiments using three leading LLMs on the widely-adopted SWE-bench benchmark, comparing Trae Agent against four state-of-the-art ensemble reasoning techniques. Experimental results demonstrate that Trae Agent consistently achieves superior performance, with an average improvement of 10.22% over all baselines in terms of Pass@1. Trae Agent has achieved first place on the SWE-bench Verified leaderboard, with a notable Pass@1 score of 75.20%. We are pleased to release Trae Agent as an open-source project to support the research community, with all resources available at https://github.com/bytedance/trae-agent.
Problem

Research questions and friction points this paper is trying to address.

Enhancing LLM-based software issue resolution performance
Addressing large ensemble spaces in reasoning methods
Achieving repository-level understanding in issue resolution
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based agent for software issue resolution
Modular agents handle generation, pruning, selection
Optimal solution search with ensemble reasoning
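
The generation, pruning, and selection stages listed above can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not Trae Agent's actual implementation: every function name here (`generate`, `prune`, `select`, `resolve_issue`) is hypothetical, the `sample`, `passes_checks`, and `vote` callables stand in for LLM or agent calls, and the specific pruning rules (whitespace-insensitive deduplication plus a pass/fail check) and the vote-counting selection are assumptions chosen for brevity rather than details taken from the text.

```python
from collections import Counter
from typing import Callable, List, Optional


def normalize(patch: str) -> str:
    """Collapse whitespace so trivially different patches compare equal."""
    return " ".join(patch.split())


def generate(sample: Callable[[int], str], n: int) -> List[str]:
    """Generation: draw n candidate patches (sample stands in for an LLM call)."""
    return [sample(i) for i in range(n)]


def prune(candidates: List[str], passes_checks: Callable[[str], bool]) -> List[str]:
    """Pruning: drop duplicates and candidates that fail a cheap check."""
    seen = set()
    kept = []
    for patch in candidates:
        key = normalize(patch)
        if key in seen or not passes_checks(patch):
            continue
        seen.add(key)
        kept.append(patch)
    return kept


def select(
    candidates: List[str],
    vote: Callable[[List[str], int], int],
    rounds: int,
) -> Optional[str]:
    """Selection: ask a (hypothetical) selector for an index several times
    and return the candidate that receives the most votes."""
    if not candidates:
        return None
    tally = Counter(vote(candidates, r) for r in range(rounds))
    best_index, _ = tally.most_common(1)[0]
    return candidates[best_index]


def resolve_issue(
    sample: Callable[[int], str],
    passes_checks: Callable[[str], bool],
    vote: Callable[[List[str], int], int],
    n: int = 4,
    rounds: int = 3,
) -> Optional[str]:
    """Chain the three stages: generate, then prune, then select."""
    return select(prune(generate(sample, n), passes_checks), vote, rounds)


if __name__ == "__main__":
    # Stub inputs in place of real LLM calls (purely illustrative).
    fake_patches = ["fix  a", "fix a", "break b", "fix c"]
    chosen = resolve_issue(
        sample=lambda i: fake_patches[i],
        passes_checks=lambda p: not p.startswith("break"),
        vote=lambda cands, r: len(cands) - 1 if r < 2 else 0,
    )
    print(chosen)  # prints "fix c"
```

In this sketch, raising `n` is the test-time scaling knob: more candidates are sampled, and pruning keeps the set handed to selection small enough to compare.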