🤖 AI Summary
In large monorepos, CI merge pipelines under high load become integration bottlenecks that slow development. To address this, we propose a build-system-agnostic optimization: using historical build data, PR metadata, and contextual features, we train a lightweight model to predict each PR's probability of building successfully, then use these predictions to prioritize pull requests dynamically, scheduling the PRs most likely to pass first during peak load. The approach requires no changes to the underlying CI infrastructure, which makes it straightforward to integrate and deploy. Evaluated on a real-world, large-scale production monorepo, it achieves significantly higher throughput than FIFO and non-learning baselines. It offers a practical way to raise throughput in high-concurrency software delivery without sacrificing compatibility with existing systems or operational simplicity.
📝 Abstract
Integrating changes into large monolithic software repositories is a critical step in modern software development that substantially impacts the speed of feature delivery, the stability of the codebase, and the overall productivity of development teams. To ensure the stability of the main branch, many organizations use merge pipelines that test software versions before the changes are permanently integrated. However, the load on merge pipelines is often so high that they become bottlenecks, despite the use of parallelization. Existing optimizations frequently rely on specific build systems, which limits their generalizability and applicability. In this paper, we propose to optimize the order of pull requests (PRs) in merge pipelines using practical build predictions that rely only on historical build data, PR metadata, and contextual information to estimate the likelihood that a build in the merge pipeline succeeds. By dynamically prioritizing PRs that are likely to pass during peak hours, this approach maximizes throughput when it matters most. Experiments conducted on a real-world, large-scale project demonstrate that predictive ordering significantly outperforms traditional first-in-first-out (FIFO) ordering as well as non-learning-based ordering strategies. Unlike alternative optimizations, this approach is agnostic to the underlying build system and can therefore be integrated easily into existing automated merge pipelines.
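
The ordering idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `PullRequest` type, its fields, the `order_queue` function, and the example probabilities are all hypothetical, and the predicted probabilities are assumed to come from some already-trained model. How peak hours are detected, and how low-probability PRs are kept from waiting indefinitely, are not specified in the abstract and are left out here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PullRequest:
    # All fields are hypothetical, chosen only for this illustration.
    pr_id: str
    arrival: int        # position in arrival order (lower = earlier)
    p_success: float    # predicted probability that the merge build passes


def order_queue(queue: List[PullRequest], peak: bool) -> List[PullRequest]:
    """Return the order in which PRs would enter the merge pipeline."""
    if not peak:
        # Off-peak: plain first-in-first-out ordering.
        return sorted(queue, key=lambda pr: pr.arrival)
    # Peak: PRs most likely to pass go first; ties fall back to arrival order.
    return sorted(queue, key=lambda pr: (-pr.p_success, pr.arrival))


# Example probabilities are made up; a real system would obtain them from a model.
queue = [
    PullRequest("A", 0, 0.40),
    PullRequest("B", 1, 0.95),
    PullRequest("C", 2, 0.80),
]
fifo_ids = [pr.pr_id for pr in order_queue(queue, peak=False)]
peak_ids = [pr.pr_id for pr in order_queue(queue, peak=True)]
```

In this sketch the off-peak order follows arrival, while the peak order moves the PRs with the highest predicted success probability to the front. Because the function only reads per-PR predictions and never inspects build targets or dependency graphs, it stays independent of the build system, which is the property the abstract emphasizes.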