Do Autonomous Agents Contribute Test Code? A Study of Tests in Agentic Pull Requests

📅 2026-01-07

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study examines how autonomous coding agents include test code in the pull requests (PRs) they submit. Using the AIDev dataset, it measures how often agentic PRs contain tests and when during the PR lifecycle those tests are introduced, then compares test-containing and non-test PRs on size, turnaround time, and merge outcomes. Test-containing PRs become more common over time, tend to be larger, and take longer to complete, while merge rates remain largely similar. Agents also differ in how often they include tests and in the balance between test and production code within test PRs.

📝 Abstract

Testing is a critical practice for ensuring software correctness and long-term maintainability. As agentic coding tools increasingly submit pull requests (PRs), it becomes essential to understand how testing appears in these agent-driven workflows. Using the AIDev dataset, we present an empirical study of test inclusion in agentic pull requests. We examine how often tests are included, when they are introduced during the PR lifecycle, and how test-containing PRs differ from non-test PRs in terms of size, turnaround time, and merge outcomes. Across agents, test-containing PRs are more common over time and tend to be larger and take longer to complete, while merge rates remain largely similar. We also observe variation across agents in both test adoption and the balance between test and production code within test PRs. Our findings provide a descriptive view of testing behavior in agentic pull requests and offer empirical grounding for future studies of autonomous software development.
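
To make the comparison described in the abstract concrete, the following is a minimal sketch of how PRs might be split into test-containing and non-test groups and then summarized by size, turnaround time, and merge rate. It is an illustration only, not the authors' pipeline: the `PullRequest` fields, the `TEST_PATH` file-path heuristic, and the function names are all assumptions introduced here, and the actual AIDev schema and the paper's test-detection rules may differ.

```python
import re
from dataclasses import dataclass
from statistics import median

# Assumed heuristic (not taken from the paper): a file counts as test code
# if it lives in a test/spec directory, is named test_*, or ends in
# .test.<ext>, _test.<ext>, .spec.<ext>, and similar.
TEST_PATH = re.compile(
    r"(^|/)(tests?|__tests__|spec)/"      # tests/, test/, __tests__/, spec/
    r"|(^|/)test_[^/]+$"                  # test_foo.py
    r"|[._-](test|spec)\.[^/.]+$",        # foo.test.ts, handler_test.go
    re.IGNORECASE,
)


@dataclass
class PullRequest:
    """Hypothetical PR record; field names are illustrative."""
    agent: str
    files: list[str]
    lines_changed: int
    hours_open: float
    merged: bool


def is_test_file(path: str) -> bool:
    return bool(TEST_PATH.search(path))


def has_tests(pr: PullRequest) -> bool:
    """A PR is test-containing if any changed file looks like test code."""
    return any(is_test_file(f) for f in pr.files)


def summarize(prs: list[PullRequest]) -> dict[str, dict[str, float]]:
    """Group PRs by test inclusion and report descriptive statistics."""
    groups: dict[str, list[PullRequest]] = {"test": [], "non_test": []}
    for pr in prs:
        groups["test" if has_tests(pr) else "non_test"].append(pr)

    summary = {}
    for name, group in groups.items():
        if not group:
            continue  # skip empty groups to avoid dividing by zero
        summary[name] = {
            "count": len(group),
            "median_lines_changed": median([pr.lines_changed for pr in group]),
            "median_hours_open": median([pr.hours_open for pr in group]),
            "merge_rate": sum(pr.merged for pr in group) / len(group),
        }
    return summary
```

Medians are used here because PR size and turnaround time are typically skewed; that choice is also an assumption rather than something stated in the abstract. Filtering `prs` by the `agent` field before calling `summarize` would give the per-agent view.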

Problem

Research questions and friction points this paper is trying to address.

autonomous agents

software testing

pull requests

test code

agentic development

Innovation

Methods, ideas, or system contributions that make the work stand out.

autonomous agents

software testing

pull requests