🤖 AI Summary
Traditional spectrum-based fault localization (SBFL) relies solely on the coverage spectra of passing and failing test cases and ignores the runtime, control-flow, and lexical information available in program execution traces. This work proposes a lightweight, context-aware SBFL method that requires neither a GPU nor heavyweight program analysis. It augments execution traces with multi-granularity features, including variable values, branch outcomes, and abstract syntax tree (AST) node encodings, and feeds them to machine learning models that re-rank suspiciousness scores in a context-sensitive way. The core contribution is a low-overhead feature extraction and fusion mechanism that improves localization accuracy without modifying the original spectrum input. Evaluated on QuixBugs and a competitive programming benchmark, the method outperforms classical formulas (e.g., Ochiai), with an average 12.7% improvement in Top-1 localization accuracy.
📝 Abstract
Traditional spectrum-based fault localization (SBFL) exploits differences in a program's coverage spectrum when run on passing and failing test cases. However, such runs can provide a wealth of additional information beyond mere coverage. Working with thousands of execution traces of short programs submitted to competitive programming contests and leveraging machine learning and additional runtime, control-flow and lexical features, we present simple ways to improve SBFL. We also propose a simple trick to integrate context information. Our approach outperforms SBFL formulae such as Ochiai on our evaluation set as well as QuixBugs and requires neither a GPU nor any form of advanced program analysis. Existing SBFL solutions could possibly be improved with reasonable effort by adopting some of the proposed ideas.
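For readers unfamiliar with the baseline, the following is a minimal sketch of classical Ochiai scoring over a coverage spectrum. It illustrates only the standard formula, score = ef / sqrt((ef + nf) * (ef + ep)), where ef and ep count the failing and passing tests that cover a line and nf counts the failing tests that do not. It is not the method proposed in this work; the function names and the toy spectrum are hypothetical.

```python
import math


def ochiai_scores(coverage, outcomes):
    """Compute Ochiai suspiciousness per line.

    coverage: list of sets, one per test, holding the covered line numbers.
    outcomes: list of bools, one per test (True = test passed).
    Both inputs are illustrative assumptions, not the paper's data format.
    """
    total_failed = sum(1 for passed in outcomes if not passed)
    all_lines = set().union(*coverage)
    scores = {}
    for line in all_lines:
        # ef / ep: failing / passing tests that execute this line.
        ef = sum(1 for cov, ok in zip(coverage, outcomes) if line in cov and not ok)
        ep = sum(1 for cov, ok in zip(coverage, outcomes) if line in cov and ok)
        # total_failed equals ef + nf in the usual notation.
        denom = math.sqrt(total_failed * (ef + ep))
        scores[line] = ef / denom if denom > 0 else 0.0
    return scores


def rank_lines(scores):
    """Order lines from most to least suspicious, breaking ties by line number."""
    return sorted(scores, key=lambda line: (-scores[line], line))


# Hypothetical spectrum: three tests, the second one fails.
coverage = [{1, 2, 3}, {1, 2, 4}, {1, 3}]
outcomes = [True, False, True]

scores = ochiai_scores(coverage, outcomes)
ranking = rank_lines(scores)
print(ranking)
```

In this toy spectrum, line 4 is covered only by the failing test and therefore ranks first. A learned re-ranker of the kind the abstract describes would presumably take scores like these together with additional features as input, but the abstract does not specify how.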