Integrating Various Software Artifacts for Better LLM-based Bug Localization and Program Repair

📅 2024-12-05
🏛️ arXiv.org
🤖 AI Summary
Existing LLM-based automated program repair (APR) approaches mostly rely on a single type of software artifact and do not systematically examine how complementary artifacts, such as issue content, stack traces, and debug information, contribute to repair. This work analyzes the contribution of each artifact type to fault localization and program repair. It proposes DEVLoRe, an APR framework that first uses issue content and stack traces to localize buggy methods, then adds debug information to localize buggy lines and generate patches that pass all unit tests. On Defects4J v2.0, DEVLoRe locates 49.3% of single buggy methods and 47.6% of non-single buggy methods, and generates plausible patches at rates of 56.0% and 14.5%, respectively, outperforming current state-of-the-art APR methods.

📝 Abstract
LLMs have garnered considerable attention for their potential to streamline Automated Program Repair (APR). LLM-based approaches can either insert the correct code or directly generate patches when provided with buggy methods. However, most LLM-based APR methods rely on a single type of software information and do not fully leverage different software artifacts. Moreover, many LLM-based approaches do not explore which specific types of information best assist in APR. Addressing this gap is crucial for advancing LLM-based APR techniques. We propose DEVLoRe, which uses issue content (description and message) and stack error traces to localize buggy methods, then relies on debug information in the buggy methods, together with the issue content and stack error traces, to localize buggy lines and generate plausible patches that pass all unit tests. The results show that while issue content is particularly effective in assisting LLMs with fault localization and program repair, different types of software artifacts complement each other. By incorporating different artifacts, DEVLoRe successfully locates 49.3% of single buggy methods and 47.6% of non-single buggy methods, and generates 56.0% and 14.5% plausible patches, respectively, on the Defects4J v2.0 dataset. This outperforms current state-of-the-art APR methods. The source code and experimental results of this work for replication are available at https://github.com/XYZboom/DEVLoRe.
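
The two-stage use of artifacts described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the DEVLoRe repository: the `BugArtifacts` class, its field names, the function names, and the prompt wording are all hypothetical, chosen only to show which artifacts feed which stage.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BugArtifacts:
    """Artifacts named in the abstract; every field name here is hypothetical."""
    issue_content: Optional[str] = None  # issue description and message
    stack_trace: Optional[str] = None    # stack error trace
    debug_info: Optional[str] = None     # debug information from buggy methods


def _sections(pairs):
    """Render only the artifacts that are present, each under its own header."""
    return "\n\n".join(f"### {title}\n{body}" for title, body in pairs if body)


def method_localization_prompt(art: BugArtifacts, method_signatures: list[str]) -> str:
    """Stage 1: issue content and stack trace are used to pick buggy methods."""
    context = _sections([
        ("Issue content", art.issue_content),
        ("Stack trace", art.stack_trace),
        ("Candidate methods", "\n".join(method_signatures)),
    ])
    return context + "\n\nList the methods most likely to contain the bug."


def line_repair_prompt(art: BugArtifacts, buggy_method_source: str) -> str:
    """Stage 2: debug info joins the other artifacts for line localization and repair."""
    context = _sections([
        ("Issue content", art.issue_content),
        ("Stack trace", art.stack_trace),
        ("Debug information", art.debug_info),
        ("Buggy method", buggy_method_source),
    ])
    return context + "\n\nIdentify the buggy lines and propose a patch."
```

Leaving an artifact as `None` drops its section from the prompt, which is one simple way to compare how much each artifact type contributes.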
Problem

Research questions and friction points this paper is trying to address.

Enhance LLM-based bug localization using multiple software artifacts.
Improve program repair by integrating issue content and stack traces.
Evaluate effectiveness of different software artifacts in APR.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates issue content and stack traces
Uses debug info for precise bug localization
Generates patches passing all unit tests
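
As a rough illustration of what "plausible" means here (a patch that passes all unit tests), the sketch below filters candidate patches with a caller-supplied test runner and computes a plausible-patch rate. The function names and the `passes_all_tests` callback are hypothetical stand-ins; the paper's actual validation harness is not shown in this summary.

```python
from typing import Callable, Iterable, Optional


def first_plausible_patch(
    candidates: Iterable[str],
    passes_all_tests: Callable[[str], bool],
) -> Optional[str]:
    """Return the first candidate patch that passes every unit test, else None."""
    for patch in candidates:
        if passes_all_tests(patch):
            return patch
    return None


def plausible_rate(outcomes: list[Optional[str]]) -> float:
    """Percentage of bugs for which a plausible patch was found."""
    if not outcomes:
        return 0.0
    return 100.0 * sum(p is not None for p in outcomes) / len(outcomes)
```

A plausible patch is not necessarily a correct one: it only satisfies the available test suite.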
Qiong Feng
Nanjing University of Science and Technology
Software Engineering, Software Architecture
Xiaotian Ma
Nanjing University of Science and Technology, China
Jiayi Sheng
Nanjing University of Science and Technology, China
Ziyuan Feng
Nanjing University of Science and Technology, China
Wei Song
Nanjing University of Science and Technology, China
Peng Liang
School of Computer Science, Wuhan University
Software Engineering, Software Architecture, Empirical Software Engineering