MORepair: Teaching LLMs to Repair Code via Multi-Objective Fine-Tuning

📅 2024-04-19

🏛️ ACM Transactions on Software Engineering and Methodology

📈 Citations: 12

✨ Influential: 1

career value

185K/year

🤖 AI Summary

Existing code repair fine-tuning methods primarily focus on syntactic patterns while neglecting the logical rationale behind code changes, resulting in poor semantic understanding and high computational overhead. Method: We propose MORepair, a multi-objective fine-tuning framework that— for the first time—explicitly models logical attribution as an independent optimization objective, jointly enhancing large language models’ (LLMs) adaptability to syntactic transformations and their capacity to reason about modification intent. MORepair employs supervised fine-tuning with a dual-objective loss function and constructs logic-annotated samples using function-level and repository-level repair data, supporting diverse open-source LLM architectures. Contribution/Results: Experiments demonstrate that MORepair achieves 11.4%–56.0% absolute improvements in repair accuracy across multiple benchmarks, significantly outperforming standard fine-tuning, Fine-tune-CoT, and RepairLLaMA—while requiring substantially lower computational cost and delivering superior semantic-aware repair performance.

Technology Category

Application Category

📝 Abstract

Within the realm of software engineering, specialized tasks on code, such as program repair, present unique challenges, necessitating fine-tuning Large language models (LLMs) to unlock state-of-the-art performance. Fine-tuning approaches proposed in the literature for LLMs on program repair tasks generally overlook the need to reason about the logic behind code changes, beyond syntactic patterns in the data. High-performing fine-tuning experiments also usually come at very high computational costs. With MORepair, we propose a novel perspective on the learning focus of LLM fine-tuning for program repair: we not only adapt the LLM parameters to the syntactic nuances of the task of code transformation (objective ➊), but we also specifically fine-tune the LLM with respect to the logical reason behind the code change in the training data (objective ➋). Such a multi-objective fine-tuning will instruct LLMs to generate high-quality patches. We apply MORepair to fine-tune four open-source LLMs with different sizes and architectures. Experimental results on function-level and repository-level repair benchmarks show that the implemented fine-tuning effectively boosts LLM repair performance by 11.4% to 56.0%. We further show that our fine-tuning strategy yields superior performance compared to the state-of-the-art approaches, including standard fine-tuning, Fine-tune-CoT, and RepairLLaMA.

Problem

Research questions and friction points this paper is trying to address.

Improving LLMs' code repair via multi-objective fine-tuning

Addressing logical reasoning gaps in program repair tasks

Reducing computational costs while enhancing repair performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-objective fine-tuning for code repair

Combines syntactic and logical reasoning training

Boosts LLM repair performance significantly

🔎 Similar Papers

RepairLLaMA: Efficient Representations and Fine-Tuned Adapters for Program Repair