Dynamic Cogeneration of Bug Reproduction Test in Agentic Program Repair

📅 2026-01-27

🤖 AI Summary

Existing automated program repair (APR) approaches typically generate patches and bug reproduction tests (BRTs) separately, or emit only the fix. This falls short of what developers want, which is a reliable fix delivered together with its BRT, and it forces teams to maintain multiple pipelines. This work studies cogeneration in agentic APR, where a large language model–based agent is instructed to produce both the fix and the BRT in the same patch. It compares several cogeneration strategies, develops patch selectors that use test-change information, and analyzes the root causes of failed cogeneration trajectories. Evaluated on 120 human-reported bugs at Google, cogeneration yields BRTs for at least as many bugs as a dedicated BRT agent without lowering the rate of plausible fixes, which reduces the engineering effort of maintaining and coordinating separate pipelines.

📝 Abstract

Bug Reproduction Tests (BRTs) have been used in many agentic Automated Program Repair (APR) systems, primarily for validating promising fixes and aiding fix generation. In practice, when developers submit a patch, they often implement the BRT alongside the fix. Our experience deploying agentic APR reveals that developers similarly desire a BRT within AI-generated patches to increase their confidence. However, canonical APR systems tend to generate BRTs and fixes separately, or focus on producing only the fix in the final patch. In this paper, we study agentic APR in the context of cogeneration, where the APR agent is instructed to generate both a fix and a BRT in the same patch. We evaluate the effectiveness of different cogeneration strategies on 120 human-reported bugs at Google and characterize different cogeneration strategies by their influence on APR agent behavior. We develop and evaluate patch selectors that account for test change information to select patches with plausible fixes (and plausible BRTs). Finally, we analyze the root causes of failed cogeneration trajectories. Importantly, we show that cogeneration allows the APR agent to generate BRTs for at least as many bugs as a dedicated BRT agent, without compromising the generation rate of plausible fixes, thereby reducing engineering effort in maintaining and coordinating separate generation pipelines for fix and BRT at scale.
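
The abstract does not spell out how its patch selectors work, so the following is only a minimal sketch of what a selector that accounts for test change information might look like. Every name here (`CandidatePatch`, `has_plausible_fix`, `has_plausible_brt`, `select_patch`) is hypothetical, and the two plausibility criteria are assumptions rather than the paper's definitions: a fix is treated as plausible when the pre-existing tests still pass, and a BRT is treated as plausible when the patch changes test files and the changed test fails on the buggy code but passes once the fix is applied.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CandidatePatch:
    """Hypothetical summary of the final patch from one agent trajectory."""
    patch_id: str
    touches_test_files: bool    # test change information: does the patch edit tests?
    brt_fails_before_fix: bool  # the changed test fails on the buggy code
    brt_passes_after_fix: bool  # the same test passes with the fix applied
    existing_tests_pass: bool   # the pre-existing test suite still passes


def has_plausible_fix(patch: CandidatePatch) -> bool:
    # Assumption: "plausible fix" means no regression in the existing tests.
    return patch.existing_tests_pass


def has_plausible_brt(patch: CandidatePatch) -> bool:
    # Assumption: a plausible BRT is a test change that fails before the fix
    # and passes after it.
    return (
        patch.touches_test_files
        and patch.brt_fails_before_fix
        and patch.brt_passes_after_fix
    )


def select_patch(candidates: List[CandidatePatch]) -> Optional[CandidatePatch]:
    """Pick a patch with a plausible fix, preferring one that also has a plausible BRT."""
    viable = [p for p in candidates if has_plausible_fix(p)]
    if not viable:
        return None
    # Rank by (plausible BRT, any test change); ties keep the earliest candidate.
    return max(viable, key=lambda p: (has_plausible_brt(p), p.touches_test_files))


if __name__ == "__main__":
    candidates = [
        CandidatePatch("fix-only", False, False, False, True),
        CandidatePatch("fix-and-brt", True, True, True, True),
        CandidatePatch("regression", True, True, True, False),
    ]
    chosen = select_patch(candidates)
    print(chosen.patch_id if chosen else "no plausible patch")
```

In this sketch a fix-only patch is still eligible, so adding the BRT preference cannot reduce the number of bugs for which some plausible fix is selected; it only changes which plausible patch is surfaced first.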

Problem

Research questions and friction points this paper is trying to address.

Bug Reproduction Test

Automated Program Repair

Cogeneration

Agentic APR

Patch Validation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Cogeneration

Bug Reproduction Test

Agentic Program Repair