Progressive Code Integration for Abstractive Bug Report Summarization

📅 2025-11-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Bug reports are often unstructured and verbose. Existing summarization methods rely mainly on surface-level textual cues and neglect the code snippets attached to a report, which leads to redundant and incomplete summaries. To address this, we propose an abstractive summarization framework that progressively fuses natural language text with long code fragments. A stepwise code integration mechanism works around the context-length limits of large language models (LLMs), so that text and code can be modeled jointly. Generating the summary abstractively, rather than extracting sentences, is intended to improve both the accuracy and the completeness of defect understanding. We evaluate the method on four benchmark datasets with eight LLMs; it improves on extractive baselines by 7.5%–58.2% and is competitive with state-of-the-art abstractive approaches.

📝 Abstract

Bug reports are often unstructured and verbose, making it challenging for developers to efficiently comprehend software issues. Existing summarization approaches typically rely on surface-level textual cues, resulting in incomplete or redundant summaries, and they frequently ignore associated code snippets, which are essential for accurate defect diagnosis. To address these limitations, we propose a progressive code-integration framework for LLM-based abstractive bug report summarization. Our approach incrementally incorporates long code snippets alongside textual content, overcoming standard LLM context window constraints and producing semantically rich summaries. Evaluated on four benchmark datasets using eight LLMs, our pipeline outperforms extractive baselines by 7.5%–58.2% and achieves performance comparable to state-of-the-art abstractive methods, highlighting the benefits of jointly leveraging textual and code information for enhanced bug comprehension.

Problem

Research questions and friction points this paper is trying to address.

Verbose, unstructured bug reports are hard for developers to comprehend efficiently

Existing summarizers ignore code snippets that are essential for accurate defect diagnosis

Long code snippets exceed standard LLM context windows

Innovation

Methods, ideas, or system contributions that make the work stand out.

Progressive code-integration framework for LLM summarization

Incrementally incorporates long code snippets with text

Overcomes LLM context limits for richer bug summaries
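
The summary above does not spell out how the incremental integration is implemented, so the following is only a minimal sketch of one plausible reading: summarize the report text first, then feed code in budget-sized chunks and ask the model to refine the running summary each time. The names `chunk_code` and `progressive_summarize`, the `llm` callable, the prompt wording, and the character-based budget `max_chars` are all assumptions made for illustration, not the authors' pipeline. The sketch also assumes the report text itself fits in one prompt.

```python
from typing import Callable, Iterable, List


def chunk_code(code: str, max_chars: int) -> List[str]:
    """Split code into line-aligned chunks of at most max_chars characters.

    A single line longer than max_chars is hard-split so that no chunk
    exceeds the budget. (Hypothetical helper, not from the paper.)
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks: List[str] = []
    current = ""
    for line in code.splitlines(keepends=True):
        # Break overlong lines into budget-sized pieces.
        pieces = [line[i:i + max_chars] for i in range(0, len(line), max_chars)]
        for piece in pieces:
            # Start a new chunk when adding this piece would exceed the budget.
            if current and len(current) + len(piece) > max_chars:
                chunks.append(current)
                current = ""
            current += piece
    if current:
        chunks.append(current)
    return chunks


def progressive_summarize(
    report_text: str,
    code_snippets: Iterable[str],
    llm: Callable[[str], str],
    max_chars: int = 4000,
) -> str:
    """Summarize the text first, then refine the summary one code chunk at a time.

    `llm` is any function mapping a prompt string to a completion string
    (an assumption; plug in whichever client you use).
    """
    # Step 1: initial summary from the natural-language part only.
    summary = llm(
        "Summarize this bug report in a few sentences.\n\n"
        f"Report:\n{report_text}"
    )
    # Step 2: each prompt carries only the running summary plus one chunk,
    # so prompt size is bounded regardless of total code length.
    for snippet in code_snippets:
        for chunk in chunk_code(snippet, max_chars):
            summary = llm(
                "Refine the bug report summary using the code below. "
                "Keep it concise.\n\n"
                f"Current summary:\n{summary}\n\nCode:\n{chunk}"
            )
    return summary
```

Because each refinement call sees only the previous summary and one chunk, the prompt length stays roughly constant as the amount of code grows. The trade-off is one model call per chunk, and a real system would likely budget in tokens rather than characters.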
