🤖 AI Summary
This work addresses the decay of ground-truth vulnerabilities in fuzzing benchmarks as the underlying software evolves. We evaluate and extend Magma's forward-porting technique, which reintroduces historical CVEs into current software versions. In an empirical portability analysis of 32 Magma CVEs, conducted four years after the benchmark's release, we combine manual auditing, Git-based provenance tracing, incremental patch application, functional reachability verification, and trigger-path analysis to diagnose failures at the commit level and to revert the interfering commits together with the bug fix. Straightforward forward-porting works for 17 CVEs; for the remaining 15, we analyse the root causes and complete the port for 9. The study identifies core challenges in maintaining benchmark vulnerabilities, particularly patch interference, semantic drift, and dependency divergence, and provides methodological guidance and a practical basis for building sustainable, verifiable fuzzing benchmarks.
📝 Abstract
Fuzzing is a well-established technique for detecting bugs and vulnerabilities. With the surge in fuzzers and fuzzing platforms such as AFL and OSS-Fuzz comes the need to benchmark these tools' performance. A common problem is that vulnerability benchmarks are based on bugs in old software releases. To address this, Magma introduced the notion of forward-porting, which reintroduces vulnerable code into current software releases. While its results are promising, the literature lacks an assessment of how maintainable this approach is over time. Indeed, adding the vulnerable code to a recent software version might either break its functionality or make the vulnerable code no longer reachable. We characterise the challenges of forward-porting by reassessing the portability of Magma's CVEs four years after its release and manually reintroducing the vulnerabilities into the current software versions. We find that the straightforward process suffices for 17 of the 32 CVEs in our study. We further investigate why trivial forward-porting fails for the other 15 CVEs. This involves identifying the commits that break the forward-porting process and reverting them in addition to the bug fix. While we manage to complete the process for nine of these CVEs, we provide an update on all 15 and explain the challenges we encountered. We thereby lay the basis for future work towards a sustainable forward-ported fuzzing benchmark.
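The step of identifying the commits that break the forward-porting process can be illustrated with a small sketch. The abstract describes this work as manual, so the code below is not the authors' procedure; it only shows one way the search could be narrowed, using a binary search over an ordered commit list. The function name `first_breaking_commit`, the predicate `port_works`, and the toy history are all hypothetical, and the search assumes that once the port breaks it stays broken in every later commit.

```python
from typing import Callable, Optional, Sequence


def first_breaking_commit(
    commits: Sequence[str],
    port_works: Callable[[str], bool],
) -> Optional[str]:
    """Return the earliest commit at which the forward-port stops working.

    `commits` is ordered oldest to newest. `port_works` is a hypothetical
    check, e.g. "the reverted fix applies, the build passes, and the bug is
    still reachable". Assumption: once the port breaks, it stays broken,
    which is what makes binary search valid here.
    """
    if not commits or port_works(commits[-1]):
        return None  # nothing to diagnose: the port still works at the tip
    lo, hi = 0, len(commits) - 1
    if not port_works(commits[lo]):
        return commits[lo]  # already broken at the oldest commit considered
    # Invariant: the port works at commits[lo] and fails at commits[hi].
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if port_works(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi]


# Toy history (assumed): the port breaks when "c4" refactors the vulnerable code.
history = ["c1", "c2", "c3", "c4", "c5", "c6"]
broken_from = history.index("c4")
culprit = first_breaking_commit(history, lambda c: history.index(c) < broken_from)
print(culprit)
```

In a real repository the predicate would have to check out each candidate commit and test the port there; a candidate identified this way would then be reverted together with the bug fix, as the abstract describes. When several independent commits interfere, the monotonicity assumption no longer holds and the search would need to be repeated after each revert.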