AI Summary
Scientific software selection frequently suffers from non-reproducible benchmarks due to multi-library, multi-metric evaluation and dynamic evolution, such as the introduction of new algorithms or modifications to test cases and evaluation criteria. This paper addresses numerical integration over arbitrary 2D/3D domains with implicit or parameterized boundaries (cut-cell quadrature), proposing the first automated benchmarking framework that systematically integrates CI/CD engineering practices into scientific computing workflows. The framework unifies GitHub Actions, Docker, Python-based scheduling, Jupyter-based report generation, and semantically versioned result archiving. It supports automated configuration, execution, visualization, and historical result comparison. It achieves >90% automation for benchmark tasks and regression detection; reduces integration time for new libraries or algorithms by 70%; and enables precise attribution of performance deviations to specific code commits. The framework significantly enhances reliability, reproducibility, and evolutionary adaptability in scientific software evaluation.
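The historical result comparison mentioned above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names `run_matrix` and `find_regressions`, the dictionary layout keyed by `(library, case)`, and the 10% tolerance are all hypothetical choices made for illustration. The sketch assumes a single scalar metric per pair where lower is better (for example, a quadrature error).

```python
from itertools import product


def run_matrix(libraries, cases, run):
    """Run every (library, case) pair and collect one scalar metric per pair.

    `run` is a caller-supplied function (hypothetical interface) that
    returns the metric for a given library and benchmark case.
    """
    return {(lib, case): run(lib, case) for lib, case in product(libraries, cases)}


def find_regressions(baseline, current, rel_tol=0.10):
    """Return the keys whose metric grew by more than rel_tol over the baseline.

    Assumes lower is better. Keys that are missing from the baseline
    (newly added libraries or cases) are skipped, since there is nothing
    to compare them against.
    """
    regressions = []
    for key, new_value in current.items():
        old_value = baseline.get(key)
        if old_value is None:
            continue
        if new_value > old_value * (1.0 + rel_tol):
            regressions.append(key)
    return sorted(regressions)
```

In a CI setting, the baseline would typically be loaded from the archived results of an earlier tagged run, and a non-empty return value would fail the job; that wiring is left out here because the paper's actual archive format is not described in this section.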
Abstract
Scientific software often offers numerous alternatives (open or closed source) for a given problem, and a user must make an informed choice by selecting the best option according to specific metrics. However, setting up benchmarks ad hoc can become overwhelming because the parameter space expands rapidly. Often, the benchmark design is also not fully fixed at the start of a project. For instance, adding new libraries, adapting metrics, or introducing new benchmark cases during the project can significantly increase complexity and necessitate laborious re-evaluation of previous results. This paper presents a proven approach that uses established Continuous Integration tools and practices to achieve a high degree of automation in benchmark execution and reporting. Our use case is numerical integration (quadrature) on arbitrary domains bounded by implicitly or parametrically defined curves or surfaces in 2D or 3D.
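To make the use case concrete, the following Python sketch shows what a single benchmark case with a known reference value might look like. It is an assumption-laden illustration, not the paper's method and not a cut-cell quadrature scheme: the domain is given implicitly by a level-set function `phi(x, y) <= 0`, the area is estimated by naively counting grid-cell midpoints inside the domain, and the metric is the relative error against the exact area. All function names are hypothetical.

```python
import math


def level_set_disk(x, y):
    """Implicit description of the unit disk: phi <= 0 inside, phi > 0 outside."""
    return x * x + y * y - 1.0


def midpoint_area(phi, lo, hi, n):
    """Estimate the area of {phi <= 0} inside the square [lo, hi]^2.

    Naive baseline: split the square into n x n cells and count the cells
    whose midpoint lies inside the domain. Real cut-cell quadrature
    libraries treat the cells cut by the boundary far more accurately.
    """
    h = (hi - lo) / n
    inside = 0
    for i in range(n):
        x = lo + (i + 0.5) * h
        for j in range(n):
            y = lo + (j + 0.5) * h
            if phi(x, y) <= 0.0:
                inside += 1
    return inside * h * h


def relative_error(approx, exact):
    """Benchmark metric: relative deviation from the known reference value."""
    return abs(approx - exact) / abs(exact)


if __name__ == "__main__":
    area = midpoint_area(level_set_disk, -1.0, 1.0, 200)
    print(f"area = {area:.6f}, relative error = {relative_error(area, math.pi):.2e}")
```

A benchmark in the sense of the abstract would run many such cases (different domains, dimensions, and resolutions) against each candidate library and record the metric for every combination, which is where the parameter space grows quickly.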