🤖 AI Summary
This study addresses the NP-hard problem of selecting low-discrepancy subsets from large point sets, a problem with significant applications in quasi-Monte Carlo methods, machine learning, and computer graphics. We establish, for the first time, the NP-hardness of this problem under kernel discrepancy measures and propose a novel framework based on Bayesian optimization. By constructing a surrogate model with deep embedding kernels, our approach efficiently searches for optimal subsets, sidestepping the computational bottlenecks of traditional combinatorial optimization. Extensive experiments demonstrate that the method substantially reduces discrepancy across several discrepancy measures, highlighting its effectiveness and versatility in low-discrepancy design tasks.
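To make the objective concrete, here is a minimal sketch of one common kernel discrepancy: the squared maximum mean discrepancy (MMD) between the empirical measure of a candidate subset and that of the full population, under a Gaussian kernel. The function names `gauss_kernel` and `mmd2`, the choice of kernel, and the lengthscale are illustrative assumptions, not necessarily the paper's exact formulation.

```python
import numpy as np

def gauss_kernel(a, b, lengthscale=0.5):
    # Pairwise Gaussian kernel matrix between point sets a (m x d) and b (n x d).
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * lengthscale ** 2))

def mmd2(subset, population, lengthscale=0.5):
    """Squared MMD between the empirical measures of `subset` and `population`
    under a Gaussian kernel -- one concrete instance of a kernel discrepancy."""
    return (gauss_kernel(subset, subset, lengthscale).mean()
            - 2.0 * gauss_kernel(subset, population, lengthscale).mean()
            + gauss_kernel(population, population, lengthscale).mean())
```

A subset whose points cover the population well drives the first two terms toward the third, so smaller values indicate lower discrepancy.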
📝 Abstract
Low-discrepancy designs play a central role in quasi-Monte Carlo methods and are increasingly influential in other domains such as machine learning, robotics, and computer graphics. In recent years, one such low-discrepancy construction method, subset selection, has received considerable attention: given a large population, one selects a small subset that is optimal with respect to a discrepancy-based objective. Versions of this problem are known to be NP-hard. In this work, we establish, for the first time, that the subset selection problem with respect to kernel discrepancies is also NP-hard. Motivated by this intractability, we propose a Bayesian optimization (BO) procedure for the subset selection problem that utilizes the recent notion of deep embedding kernels. We demonstrate the performance of the BO algorithm in minimizing discrepancy measures and note that the framework is broadly applicable to any design criterion.
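The paper's surrogate is built on deep embedding kernels; as a rough illustration of the overall BO-over-subsets loop, here is a simplified sketch that swaps the deep embedding kernel for an ordinary Gaussian-process surrogate over hand-crafted moment features of each subset. Every name here (`embed`, `propose`, the random candidate-pool strategy, and the feature choice) is a hypothetical stand-in, not the authors' implementation.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)

def embed(idx, population):
    # Hypothetical fixed-length representation of a subset: coordinate-wise
    # mean and standard deviation of its points (a crude stand-in for a
    # learned deep embedding).
    pts = population[idx]
    return np.concatenate([pts.mean(axis=0), pts.std(axis=0)])

def propose(population, m, evaluate, n_init=10, n_iter=30, pool=200):
    """BO-style search over size-m subsets: fit a surrogate on evaluated
    subsets, then pick the next candidate by expected improvement."""
    N = len(population)
    cand = [rng.choice(N, size=m, replace=False) for _ in range(n_init)]
    X = np.array([embed(c, population) for c in cand])
    y = np.array([evaluate(population[c]) for c in cand])
    for _ in range(n_iter):
        gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
        # Score a random pool of candidate subsets with expected improvement
        # (we are minimizing, so improvement is best-so-far minus the mean).
        pool_idx = [rng.choice(N, size=m, replace=False) for _ in range(pool)]
        Z = np.array([embed(c, population) for c in pool_idx])
        mu, sd = gp.predict(Z, return_std=True)
        sd = np.maximum(sd, 1e-9)
        imp = y.min() - mu
        ei = imp * norm.cdf(imp / sd) + sd * norm.pdf(imp / sd)
        nxt = pool_idx[int(np.argmax(ei))]
        cand.append(nxt)
        X = np.vstack([X, embed(nxt, population)])
        y = np.append(y, evaluate(population[nxt]))
    return cand[int(np.argmin(y))]
```

Pairing this with the `mmd2` sketch above, `propose(population, m=32, evaluate=lambda s: mmd2(s, population))` would select a 32-point subset. The key design choice is representing each variable-content subset by a fixed-length embedding so a standard surrogate can score unseen candidates; in the paper, deep embedding kernels play this role in a learned, more expressive way.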