🤖 AI Summary
In evaluating continuous single-objective stochastic optimization algorithms, determining the optimal number of independent runs poses a fundamental trade-off between statistical reliability and computational efficiency. To address this, we propose a probability-theoretic, adaptive online estimation method. Our approach incorporates a robustness-driven distributional test that dynamically determines the minimum required number of runs per problem instance while guaranteeing statistical significance and eliminating redundant evaluations. Integrated with online learning, the method is algorithm-agnostic—supporting diverse solvers including 104 differential evolution configurations and Nevergrad variants. Validated on the COCO benchmark through 5.748 million experiments, it achieves estimation accuracy of 82%–95%, reduces average run counts by 50%, and substantially enhances experimental efficiency, scalability, and green computing performance.
📝 Abstract
Determining the number of algorithm runs is a critical aspect of experimental design, as it directly influences the experiment's duration and the reliability of its outcomes. This paper introduces an empirical approach to estimating the required number of runs per problem instance for accurately estimating the performance of continuous single-objective stochastic optimization algorithms. The method leverages probability theory, incorporates a robustness check to identify significant imbalances in the data distribution relative to the mean, and dynamically adjusts the number of runs during execution in an online fashion. The proposed methodology was extensively tested across two algorithm portfolios (104 Differential Evolution configurations and the Nevergrad portfolio) and the COCO benchmark suite, totaling 5,748,000 runs. The results demonstrate 82%–95% accuracy in estimation across different algorithms, allowing a reduction of approximately 50% in the number of runs without compromising optimization outcomes. This online calculation of required runs not only improves benchmarking efficiency, but also reduces energy consumption, fostering a more environmentally sustainable computing ecosystem.
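To make the idea of an online, adaptive run-count estimator concrete, here is a minimal sketch. It uses a simple confidence-interval stopping rule on the mean performance; the paper's actual distributional robustness test is not specified here, so the stopping criterion, the `min_runs`/`max_runs` bounds, and the `rel_precision` threshold are all illustrative assumptions, not the authors' method.

```python
import math
import random
import statistics

def estimate_runs(run_once, min_runs=5, max_runs=50, rel_precision=0.05, z=1.96):
    """Adaptively collect runs of a stochastic optimizer until the
    half-width of the (normal-approximation) confidence interval of the
    mean drops below rel_precision * |mean|, or max_runs is reached.
    NOTE: an illustrative stopping rule, not the paper's exact test.
    """
    results = [run_once() for _ in range(min_runs)]
    while len(results) < max_runs:
        mean = statistics.fmean(results)
        sd = statistics.stdev(results)
        half_width = z * sd / math.sqrt(len(results))
        # Stop once the estimate of the mean is sufficiently stable.
        if abs(mean) > 0 and half_width <= rel_precision * abs(mean):
            break
        results.append(run_once())  # otherwise, perform one more run
    return len(results), statistics.fmean(results)

# Example with a synthetic noisy "optimizer" returning a best fitness value.
random.seed(0)
n_runs, mean_fitness = estimate_runs(lambda: random.gauss(10.0, 0.5))
```

In a real benchmarking setting, `run_once` would launch one independent run of the optimizer on a given problem instance and return its final performance measure; instances with low run-to-run variance then stop early, which is the source of the reported run-count savings.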