🤖 AI Summary
To address the limited performance of fine-tuned large language models (LLMs) on automated program repair (APR) when training data is scarce, this paper adapts prompt tuning to APR and proposes knowledge prompt tuning, a prompt tuning approach guided by domain knowledge. It incorporates six distinct types of code- or bug-related domain knowledge into the extra prompt tokens, which are tuned together with the LLM on a small dataset. The approach is evaluated with three LLMs of different sizes (e.g., CodeT5) on six datasets spanning four programming languages. Prompt tuning with knowledge generally outperforms fine-tuning across the experimental settings, with an average improvement of 87.33% over fine-tuning in data scarcity scenarios. Its core contributions are: (1) to the authors' knowledge, the first adaptation and evaluation of prompt tuning for APR; and (2) evidence that code- or bug-related domain knowledge improves repair when data is scarce.
📝 Abstract
Automated Program Repair (APR) aims to enhance software reliability by automatically generating bug-fixing patches. Recent work has advanced the state of the art in APR by fine-tuning pre-trained large language models (LLMs), such as CodeT5. However, fine-tuning becomes less effective when training data is scarce, which can be a common issue in practice. To alleviate this limitation, this paper adapts prompt tuning for APR and conducts a comprehensive study of its effectiveness in data scarcity scenarios, using three LLMs of different sizes and six diverse datasets across four programming languages. Prompt tuning rewrites the input to a model by adding extra prompt tokens and tunes both the model and the prompts on a small dataset. These tokens provide task-specific knowledge that can improve the model for APR, which is especially critical in data scarcity scenarios. Moreover, domain knowledge has proven crucial in many code intelligence tasks, but existing studies fail to leverage it during prompt tuning for APR. To close this gap, we introduce knowledge prompt tuning, an approach that adapts prompt tuning with six distinct types of code- or bug-related domain knowledge for APR. To the best of our knowledge, our work is the first to adapt and evaluate prompt tuning, and the effectiveness of code- or bug-related domain knowledge, for APR, particularly under data scarcity settings. Our evaluation results demonstrate that prompt tuning with knowledge generally outperforms fine-tuning under various experimental settings, achieving an average improvement of 87.33% over fine-tuning in data scarcity scenarios.
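The abstract describes prompt tuning as rewriting the model input with extra prompt tokens, optionally enriched with domain knowledge. A minimal sketch of what such an input rewrite might look like is given below. Everything in it is an illustrative assumption rather than the paper's actual design: the `PromptedInput` class, the `[SOFT_i]` placeholder tokens, the instruction wording, and the `bug type` knowledge field are all hypothetical, and the abstract does not enumerate the six knowledge types. The sketch only builds the rewritten input string; training the prompt embeddings and the model is not shown.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PromptedInput:
    """Buggy code wrapped with extra prompt tokens (hypothetical layout)."""

    buggy_code: str
    language: str
    # Hypothetical knowledge fields; the source does not list the six types.
    knowledge: Dict[str, str] = field(default_factory=dict)
    # Number of placeholder tokens whose embeddings would be learned in training.
    num_soft_tokens: int = 4

    def render(self) -> str:
        """Return the rewritten model input as a single string."""
        soft = " ".join(f"[SOFT_{i}]" for i in range(self.num_soft_tokens))
        hints = " ".join(f"{name}: {value}." for name, value in self.knowledge.items())
        parts = [soft, f"Fix the bug in this {self.language} code.", hints, self.buggy_code]
        # Skip empty segments so the output has no stray spaces.
        return " ".join(p for p in parts if p)


example = PromptedInput(
    buggy_code="if (i <= len) { total += a[i]; }",
    language="Java",
    knowledge={"bug type": "off-by-one"},  # assumed knowledge entry
    num_soft_tokens=2,
)
print(example.render())
```

In this sketch the placeholder tokens stand in for trainable prompt embeddings, while the knowledge entries are plain text placed ahead of the buggy code; a real setup would map the placeholders to learnable vectors inside the model's embedding layer.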