🤖 AI Summary
Existing automated fault localization (FL) methods predominantly operate at the method level, lacking fine-grained precision and interpretability—thus failing to meet students’ need for concrete, actionable repair guidance in educational settings. Line-level FL approaches, meanwhile, rely on numeric line-number prediction, which is ill-suited for large language models (LLMs). This paper proposes FLAME: a fine-grained, interpretable FL framework tailored for educational programming assignments. Its core innovation lies in leveraging LLMs to generate natural-language explanations alongside line-level code annotations—replacing opaque line-number predictions—and employing a multi-model weighted ensemble to enhance robustness and accuracy. Evaluated on a programming assignment dataset, FLAME achieves 207 more Top-1 correct localizations than the state-of-the-art. It also consistently outperforms all baselines on the Defects4J benchmark, demonstrating its effectiveness and pedagogical value in both educational and general software engineering contexts.
📝 Abstract
Providing timely and personalized guidance on students' programming assignments offers significant practical value, helping students complete assignments and enhance their learning. In recent years, various automated Fault Localization (FL) techniques have demonstrated promising results in identifying errors in programs. However, existing FL techniques face challenges when applied to educational contexts. Most approaches operate at the method level without explanatory feedback, a granularity too coarse for students who need actionable insights to identify and fix their errors. While some approaches attempt line-level fault localization, they often depend on predicting line numbers directly in numerical form, which is ill-suited to LLMs. To address these challenges, we propose FLAME, a fine-grained, explainable Fault Localization method tailored for programming assignments via LLM-guided Annotation and Model Ensemble. FLAME leverages rich contextual information specific to programming assignments to guide LLMs in identifying faulty code lines. Instead of directly predicting line numbers, we prompt the LLM to annotate faulty code lines with detailed explanations, enhancing both localization accuracy and educational value. To further improve reliability, we introduce a weighted multi-model voting strategy that aggregates results from multiple LLMs to determine the suspiciousness of each code line. Extensive experimental results demonstrate that FLAME outperforms state-of-the-art fault localization baselines on programming assignments, successfully localizing 207 more faults at top-1 than the best-performing baseline. Beyond educational contexts, FLAME also generalizes effectively to general-purpose software codebases, outperforming all baselines on the Defects4J benchmark.
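The weighted multi-model voting step described above can be sketched roughly as follows. This is a hypothetical illustration, not the paper's actual implementation: the function name, the per-model weights, and the simple weighted-sum aggregation are all assumptions, since the abstract does not specify the exact scoring formula.

```python
from collections import defaultdict

def ensemble_suspiciousness(model_votes, weights):
    """Aggregate line-level fault annotations from multiple LLMs into a
    single suspiciousness ranking (illustrative sketch only)."""
    scores = defaultdict(float)
    for model, annotated_lines in model_votes.items():
        for line in annotated_lines:
            # Each model's vote for a line contributes its weight.
            scores[line] += weights.get(model, 1.0)
    # Rank lines by descending suspiciousness score.
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Example: three (hypothetical) models annotate suspected faulty lines
# of one student submission.
votes = {
    "model_a": {3, 7},
    "model_b": {7},
    "model_c": {7, 12},
}
weights = {"model_a": 0.5, "model_b": 0.3, "model_c": 0.2}
ranking = ensemble_suspiciousness(votes, weights)
# Line 7, voted for by all three models, accumulates the highest score
# (0.5 + 0.3 + 0.2 = 1.0) and is ranked top-1.
```

In this sketch, agreement across models raises a line's suspiciousness, so a line flagged by several LLMs outranks one flagged by a single model, which is the intuition behind ensembling for robustness.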