🤖 AI Summary
This work addresses the computational cost and approximation error of traditional entropy search acquisition functions for Bayesian optimization, which rely on expensive Monte Carlo approximations and require custom implementations. The authors propose a two-stage amortized strategy: first, a Prior-data Fitted Network (PFN) is trained to condition on information about the optimum; then, an α-PFN is trained to predict the information gain measured with that first PFN, so that the acquisition function can be evaluated with a single forward pass per candidate. The approach learns entropy search acquisition functions with PFNs and replaces hand-crafted heuristic approximations. Empirically, the method is competitive with state-of-the-art entropy search implementations on both synthetic and real-world benchmarks, with reported speedups of more than 50×.
📝 Abstract
Information-theoretic acquisition functions such as Entropy Search (ES) offer a principled exploration-exploitation framework for Bayesian optimization (BO). However, their practical implementation relies on complicated and slow approximations, namely Monte Carlo estimation of the information gain. This complexity can introduce numerical errors and requires specialized, hand-crafted implementations. We propose a two-stage amortization strategy that learns to approximate entropy search-based acquisition functions with Prior-data Fitted Networks (PFNs), so that the acquisition value is obtained in a single forward pass. First, a PFN is trained to condition on information about the optima; second, the $\alpha$-PFN is trained to predict the expected information gain, using information gains measured with the first PFN as training targets. The $\alpha$-PFN offers a flexible learned approximation that replaces the complex heuristic approximations with a single forward pass per candidate, enabling rapid and extensible acquisition evaluation. Empirically, our approach is competitive with state-of-the-art entropy search implementations on synthetic and real-world benchmarks, while accelerating the different entropy search variants across all our experiments, with speedups of over 50x. Source code: https://github.com/automl/AlphaPFN.
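To make the quantity being amortized concrete, the following is a minimal sketch of a Monte Carlo information-gain estimate for a single candidate point. It is not the authors' implementation and does not use the AlphaPFN code. It assumes a toy setting in which the predictive distribution at the candidate is Gaussian, and in which conditioning on a sampled optimum value simply truncates that Gaussian from above (the closed form used in max-value entropy search, swapped in here as a stand-in for the first-stage PFN). All function names and the `optimum_samples` input are hypothetical.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def gaussian_entropy(sigma):
    """Differential entropy of N(mu, sigma^2); it does not depend on mu."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma * sigma)


def truncated_gaussian_entropy(mu, sigma, upper):
    """Entropy of N(mu, sigma^2) truncated to (-inf, upper].

    Toy stand-in (assumption) for the predictive distribution
    conditioned on a sampled optimum value `upper`.
    """
    gamma = (upper - mu) / sigma
    cdf = max(normal_cdf(gamma), 1e-12)  # guard against log(0)
    return (gaussian_entropy(sigma) + math.log(cdf)
            - gamma * normal_pdf(gamma) / (2.0 * cdf))


def mc_information_gain(mu, sigma, optimum_samples):
    """Monte Carlo estimate: H[p(y)] minus the mean of H[p(y | f*)]."""
    conditional = [truncated_gaussian_entropy(mu, sigma, f_star)
                   for f_star in optimum_samples]
    return gaussian_entropy(sigma) - sum(conditional) / len(conditional)


# Hypothetical samples of the optimum value and two candidate points,
# each described by its predictive mean and standard deviation.
optimum_samples = [1.0, 1.2, 1.5]
uncertain_gain = mc_information_gain(0.0, 1.0, optimum_samples)
confident_gain = mc_information_gain(0.0, 0.1, optimum_samples)
print(uncertain_gain, confident_gain)
```

In this toy setting the loop over `optimum_samples` is cheap, but it has to be repeated for every candidate and every optimum sample. Per the abstract, the $\alpha$-PFN is trained on information gains of this kind so that a single forward pass per candidate replaces the estimate.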