🤖 AI Summary
In CI regression debugging, verbose build logs undermine the effectiveness of traditional text-differencing algorithms (e.g., LCS) and make manual analysis tedious. This paper proposes CiDiff, the first lightweight, semantic-aware differencing algorithm specifically designed for the structural characteristics of CI build logs. Its core contributions are: (1) syntax-aware log chunking to group semantically coherent lines; (2) build-phase-aware semantic alignment and weighted matching to prioritize critical failures; and (3) an optimized LCS variant that jointly preserves line-order stability and semantic consistency. Evaluated on 17,906 real-world CI regression cases, CiDiff reduces the number of diff lines to inspect by about 60% in the median case, with reasonable overhead compared to LCS-diff. In a user study, CiDiff is preferred by the majority of participants in 70% of the regression cases, whereas LCS-diff is preferred in only 5% of the cases.
📝 Abstract
Continuous integration (CI) is widely used by developers to ensure the quality and reliability of their software projects. However, diagnosing a CI regression is a tedious process that involves the manual analysis of lengthy build logs. In this paper, we explore how textual differencing can support the debugging of CI regressions. Because off-the-shelf diff algorithms produce suboptimal results, we introduce CiDiff, a new diff algorithm specifically tailored to build logs. We evaluate CiDiff against several baselines on a novel dataset of 17,906 CI regressions, performing an accuracy study, a quantitative study, and a user study. Notably, our algorithm reduces the number of lines to inspect by about 60% in the median case, with reasonable overhead compared to the state-of-practice LCS-diff. Finally, our algorithm is preferred by the majority of participants in 70% of the regression cases, whereas LCS-diff is preferred in only 5% of the cases.
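To make the LCS-diff baseline concrete, the following is a minimal sketch of standard line-level LCS differencing applied to a passing and a failing build log. It is not CiDiff itself; the function names `lcs_diff` and `lines_to_inspect` and the two toy logs are hypothetical and chosen only for illustration.

```python
def lcs_diff(old, new):
    """Line-level diff via longest common subsequence (LCS).

    Returns a list of (tag, line) pairs, where tag is ' ' (unchanged),
    '-' (only in old), or '+' (only in new).
    """
    n, m = len(old), len(new)
    # table[i][j] holds the LCS length of old[i:] and new[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if old[i] == new[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the start to emit the edit script.
    diff, i, j = [], 0, 0
    while i < n and j < m:
        if old[i] == new[j]:
            diff.append((" ", old[i]))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            diff.append(("-", old[i]))
            i += 1
        else:
            diff.append(("+", new[j]))
            j += 1
    diff.extend(("-", line) for line in old[i:])
    diff.extend(("+", line) for line in new[j:])
    return diff


def lines_to_inspect(diff):
    """Count the added or removed lines a developer would have to read."""
    return sum(1 for tag, _ in diff if tag != " ")


# Hypothetical toy logs: one passing build and one failing build.
passing = [
    "Step 1/3: checkout",
    "Downloaded 42 packages in 1.2s",
    "Step 2/3: compile",
    "Step 3/3: test",
    "BUILD SUCCESS",
]
failing = [
    "Step 1/3: checkout",
    "Downloaded 42 packages in 1.9s",
    "Step 2/3: compile",
    "Step 3/3: test",
    "ERROR: test_login failed",
    "BUILD FAILURE",
]

for tag, line in lcs_diff(passing, failing):
    print(tag, line)
print("lines to inspect:", lines_to_inspect(lcs_diff(passing, failing)))
```

On this toy input, the timing-only change in the download line is reported alongside the actual test failure. This is one plausible kind of noise that a log-specific diff could aim to suppress, although the abstract does not say which sources of noise CiDiff targets.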