🤖 AI Summary
This paper addresses two challenges in software debugging: low fault localization accuracy and weak root-cause interpretability. To this end, we propose an interpretable diagnosis method that combines multiple execution features. In an empirical analysis of 310 real-world bugs across 20 projects, we examine 17 execution features and find that scalar pairs correlate most strongly with failures. Building on this insight, the approach extracts fine-grained execution features, including variable values, branch conditions, and definition-use pairs, and trains a decision tree to distinguish passing from failing runs. The tree yields human-readable diagnoses that pinpoint faulty locations and explain the causes of the failure. The evaluation shows that the generated diagnoses achieve high predictive accuracy, which supports their reliability as a debugging aid that balances precision with interpretability.
📝 Abstract
Fault localization is a fundamental aspect of debugging, aiming to identify code regions likely responsible for failures. Traditional techniques primarily correlate statement execution with failures, yet program behavior is influenced by diverse execution features (such as variable values, branch conditions, and definition-use pairs) that can provide richer diagnostic insights. In an empirical study of 310 bugs across 20 projects, we analyzed 17 execution features and assessed their correlation with failure outcomes. Our findings suggest that fault localization benefits from a broader range of execution features: (1) scalar pairs exhibit the strongest correlation with failures; (2) beyond line executions, def-use pairs and executed functions are key indicators for fault localization; and (3) combining multiple features is more effective than relying on any individual feature. Building on these insights, we introduce a debugging approach to diagnose failure circumstances. The approach extracts fine-grained execution features and trains a decision tree to differentiate passing and failing runs. From this model, we derive a diagnosis that pinpoints faulty locations and explains the underlying causes of the failure. Our evaluation demonstrates that the generated diagnoses achieve high predictive accuracy, reinforcing their reliability. These interpretable diagnoses help developers debug software efficiently by providing deeper insight into failure causes.
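To make the decision-tree step concrete, the following is a minimal sketch, not the authors' implementation. It assumes each run is reduced to binary execution features (observed or not) and labelled PASS or FAIL. The feature names, the toy runs, and the Gini-based split criterion are all illustrative assumptions. The sketch trains a small tree and reads off the feature conditions that lead to a failing leaf, which is the kind of diagnosis the abstract describes.

```python
from collections import Counter

# Hypothetical toy data: feature name -> 1 if observed in the run, else 0.
# The names (line, scalar pair, def-use pair) are made up for illustration.
RUNS = [
    ({"line:12": 1, "scalar:x<y@12": 1, "defuse:x@8->12": 1}, "FAIL"),
    ({"line:12": 1, "scalar:x<y@12": 1, "defuse:x@8->12": 0}, "FAIL"),
    ({"line:12": 1, "scalar:x<y@12": 0, "defuse:x@8->12": 1}, "PASS"),
    ({"line:12": 1, "scalar:x<y@12": 0, "defuse:x@8->12": 0}, "PASS"),
    ({"line:12": 1, "scalar:x<y@12": 0, "defuse:x@8->12": 1}, "PASS"),
]
FEATURES = ["line:12", "scalar:x<y@12", "defuse:x@8->12"]


def gini(labels):
    """Gini impurity of a list of labels (0.0 means the list is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, features):
    """Return the feature that most reduces weighted Gini impurity, or None."""
    best, best_score = None, gini([label for _, label in rows])
    for f in features:
        on = [label for feats, label in rows if feats.get(f, 0)]
        off = [label for feats, label in rows if not feats.get(f, 0)]
        if not on or not off:
            continue  # feature does not separate the runs at all
        score = (len(on) * gini(on) + len(off) * gini(off)) / len(rows)
        if score < best_score - 1e-12:
            best, best_score = f, score
    return best


def build_tree(rows, features, depth=0, max_depth=3):
    """Build a tree: a leaf is a label string, a node is (feature, on, off)."""
    feature = best_split(rows, features) if depth < max_depth else None
    if feature is None:
        return Counter(label for _, label in rows).most_common(1)[0][0]
    on = [r for r in rows if r[0].get(feature, 0)]
    off = [r for r in rows if not r[0].get(feature, 0)]
    rest = [f for f in features if f != feature]
    return (
        feature,
        build_tree(on, rest, depth + 1, max_depth),
        build_tree(off, rest, depth + 1, max_depth),
    )


def failure_rules(tree, path=()):
    """Collect every root-to-leaf path that ends in a FAIL leaf."""
    if isinstance(tree, str):
        return [path] if tree == "FAIL" else []
    feature, on, off = tree
    return (
        failure_rules(on, path + ((feature, True),))
        + failure_rules(off, path + ((feature, False),))
    )


tree = build_tree(RUNS, FEATURES)
rules = failure_rules(tree)
for rule in rules:
    print(" AND ".join(f"{name} is {value}" for name, value in rule), "=> FAIL")
```

In this toy data the line feature is executed in every run and therefore cannot separate passing from failing runs, whereas the scalar-pair feature separates them perfectly, so the tree splits on it alone. That mirrors, on a very small scale, the abstract's point that features beyond line execution can carry the diagnostic signal.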