A Survey of LLM-based Automated Program Repair: Taxonomies, Design Paradigms, and Applications

📅 2025-06-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study systematically reviews 63 LLM-based automated program repair (APR) systems published between January 2022 and June 2025, addressing three core challenges: semantic correctness verification beyond test suites, large-scale repository-level defect repair, and optimization of LLM inference cost. We propose the first comprehensive taxonomy, categorizing APR designs into four paradigms: fine-tuning, prompt engineering, pipeline-based workflows, and agent frameworks. Quantitative analysis demonstrates how retrieval augmentation and static/dynamic code analysis enhance context quality, while revealing fundamental trade-offs among cost, controllability, and scalability across paradigms. Key insights identify lightweight feedback mechanisms, repository-aware retrieval, and cost-aware planning as critical levers for advancement. The work establishes a theoretical framework and practical roadmap to enhance the reliability, scalability, and real-world applicability of LLM-APR systems.

📝 Abstract
Large language models (LLMs) are reshaping automated program repair (APR). We categorize 63 recent LLM-based APR systems, published from January 2022 to June 2025, into four paradigms, and show how retrieval- or analysis-augmented contexts strengthen any of them. This taxonomy clarifies key trade-offs: fine-tuning delivers strong task alignment at high training cost; prompting enables rapid deployment but is limited by prompt design and context windows; procedural pipelines offer reproducible control with moderate overhead; and agentic frameworks tackle multi-hunk or cross-file bugs at the price of increased latency and complexity. Persistent challenges include verifying semantic correctness beyond test suites, repairing repository-scale defects, and lowering LLM inference costs. We outline research directions that combine lightweight human feedback, repository-aware retrieval, code analysis, and cost-aware planning to advance reliable and efficient LLM-based APR.
Problem

Research questions and friction points this paper is trying to address.

Categorizing LLM-based APR systems into four paradigms
Addressing challenges in verifying semantic correctness beyond test suites
Advancing reliable and efficient LLM-based APR with lightweight human feedback
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based APR systems categorized into four paradigms
Retrieval or analysis-augmented contexts enhance paradigms
Lightweight human feedback and repository-aware retrieval
Boyang Yang
Jisuan Institute of Technology, Beijing JudaoYouda Network Technology Co. Ltd.
AI4SE, CS Education, LLM4SE, Program Repair, Software Engineering
Zijian Cai
Yanshan University, China
Fengling Liu
China University of Mining and Technology, China
Bach Le
School of Computing and Information Systems, University of Melbourne, Australia
Lingming Zhang
Department of Computer Science, University of Illinois Urbana-Champaign, USA
Tegawendé F. Bissyandé
Chief Scientist II / ERC Fellow / TruX @SnT, University of Luxembourg
Software Security, Program Repair, Code Search, Machine Learning, Big Code
Yang Liu
School of Computer Science and Engineering, Nanyang Technological University, Singapore
Haoye Tian
Assistant Professor, Aalto University
Software Engineering, Machine Learning, Program Repair, AI4SE, LLM4SE