Detecting the Root Cause Code Lines in Bug-Fixing Commits by Heterogeneous Graph Learning

📅 2025-05-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing approaches struggle to localize the root-cause lines behind bug-fixing commits in complex software because they cannot model heterogeneous commit structures or cross-line code dependencies, which limits localization accuracy. To address this, we propose a root-cause localization method based on heterogeneous graph neural networks. Our approach constructs a multi-granularity program dependency heterogeneous graph integrating Abstract Syntax Trees (ASTs), Control Flow Graphs (CFGs), and Data Flow Graphs (DFGs). We design a cross-line semantic retention mechanism that regulates semantic propagation through attenuation and reinforcement gates, and introduce a heterogeneous attention-driven semantic aggregation module to explicitly capture fine-grained dependencies among diverse node and edge types. Evaluated on 675 bug-fixing commits across 87 open-source projects, our method improves the MFR metric over state-of-the-art baselines by 34.82% to 96.83%, identifying critical buggy lines more precisely.


📝 Abstract
With the continuous growth in the scale and complexity of software systems, defect remediation has become increasingly difficult and costly. Automated defect prediction tools can proactively identify defect-prone changes within software projects, thereby improving development efficiency. However, existing work on heterogeneous and complex software projects still faces challenges: it struggles with heterogeneous commit structures and ignores cross-line dependencies in code changes, both of which reduce the accuracy of defect identification. To address these challenges, we propose an approach called RC_Detector. RC_Detector comprises three main components: the bug-fixing graph construction component, the code semantic aggregation component, and the cross-line semantic retention component. The bug-fixing graph construction component identifies the code syntax structures and program dependencies within bug-fixing commits and transforms them into heterogeneous graphs, converting the source code into vector representations. The code semantic aggregation component adapts to heterogeneous data by using heterogeneous attention to learn the hidden semantic representation of target code lines. The cross-line semantic retention component regulates propagated semantic information using attenuation and reinforcement gates derived from old and new code semantic representations, effectively preserving cross-line semantic relationships. We evaluated the model on data collected from 87 open-source projects, comprising 675 bug-fixing commits. The experimental results demonstrate that our model outperforms state-of-the-art approaches, improving MFR by 83.15%, 96.83%, 78.71%, 74.15%, 54.14%, 91.66%, 91.66%, and 34.82%, respectively, over the compared baselines.
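As a rough illustration of how a rank-based metric and the quoted percentage gains could be computed, the sketch below assumes that MFR stands for mean first rank (the average 1-based position of the first true root-cause line in each ranked list, lower being better) and that an improvement is the relative reduction against a baseline. The abstract states neither definition, so both are assumptions, and the function names and toy data are hypothetical.

```python
from statistics import mean


def first_rank(ranked_lines, root_cause_lines):
    """Return the 1-based rank of the first root-cause line, or None if absent."""
    for rank, line in enumerate(ranked_lines, start=1):
        if line in root_cause_lines:
            return rank
    return None


def mean_first_rank(predictions):
    """Average the first ranks over commits where a root-cause line was ranked.

    `predictions` is a list of (ranked_lines, root_cause_lines) pairs.
    Assumption: commits with no hit are skipped; the paper may treat them differently.
    """
    ranks = [first_rank(ranked, truth) for ranked, truth in predictions]
    return mean(r for r in ranks if r is not None)


def relative_improvement(baseline_mfr, model_mfr):
    """Percentage reduction in MFR relative to a baseline (assumed definition)."""
    return 100.0 * (baseline_mfr - model_mfr) / baseline_mfr


# Toy data: line numbers ranked by suspiciousness, paired with true root-cause lines.
toy_predictions = [
    ([12, 7, 30], {7}),    # first hit at rank 2
    ([4, 9, 2], {4}),      # first hit at rank 1
    ([5, 8, 1, 3], {3}),   # first hit at rank 4
]
toy_mfr = mean_first_rank(toy_predictions)
```

Under these assumptions, a baseline MFR of 10.0 and a model MFR of 2.5 would correspond to a 75% improvement.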
Problem

Research questions and friction points this paper is trying to address.

Detecting root cause code lines in bug-fixing commits
Handling heterogeneous commit structures in software projects
Preserving cross-line dependencies for accurate defect identification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Heterogeneous graph learning for bug detection
Semantic aggregation with heterogeneous attention
Cross-line semantic retention via attenuation and reinforcement gates
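The gating idea in the last item can be sketched in a few lines. This is a minimal illustration rather than RC_Detector's actual formulation, which the abstract does not give: it assumes two equal-length vectors `old` and `new` for a code line's semantic representation, scalar sigmoid gates computed from their concatenation, and hypothetical weight vectors `w_att` and `w_rein` that would be learned in a real model.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def gated_update(old, new, w_att, w_rein):
    """Blend old and new representations of a code line with two scalar gates.

    The attenuation gate decides how much of the old representation to damp;
    the reinforcement gate decides how much of the new one to let through.
    Both gates are computed from the concatenation of old and new (assumption).
    """
    joint = list(old) + list(new)
    attenuation = sigmoid(dot(w_att, joint))
    reinforcement = sigmoid(dot(w_rein, joint))
    return [(1.0 - attenuation) * o + reinforcement * n for o, n in zip(old, new)]


# With all-zero weights both gates equal 0.5, so the result is the plain average.
old_repr = [1.0, 2.0]
new_repr = [3.0, 4.0]
zero_weights = [0.0, 0.0, 0.0, 0.0]
blended = gated_update(old_repr, new_repr, zero_weights, zero_weights)
```

Using scalar gates keeps the sketch short; a per-dimension gate (one sigmoid per vector element) would be an equally plausible reading of the abstract.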
Liguo Ji
Dalian Maritime University, The Dalian Key Laboratory of Artificial Intelligence, China
Shikai Guo
Associate Professor, Dalian Maritime University
AI for EDA, FPGA Logical Synthesis, Placement & Routing, Compile Optimization, Software Engineering
Lehuan Zhang
Dalian University of Technology, China
Hui Li
Dalian Maritime University, China
Yu Chai
Shanghai Jiao Tong University, China
Rong Chen
Dalian Maritime University, China