$α$-PFN: Fast Entropy Search via In-Context Learning

📅 2026-06-05

🤖 AI Summary

This work addresses the computational cost and approximation error of entropy search-based acquisition functions for Bayesian optimization, which traditionally rely on slow Monte Carlo estimates of the information gain and require specialized, hand-crafted implementations. The authors propose a two-stage amortization strategy: first, a Prior-data Fitted Network (PFN) is trained to condition on information about the optimum; then, an α-PFN is trained on information gains measured with that first PFN, so that the acquisition value of a candidate can be predicted in a single forward pass. The learned approximation replaces the hand-crafted heuristic approximations used by existing implementations. Empirically, the method is competitive with state-of-the-art entropy search implementations on synthetic and real-world benchmarks, with reported speedups of over 50×.
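For reference, the quantity being amortized can be written as follows. This is the standard textbook form of the entropy search information gain, not a formula taken from the paper; the symbols $\mathcal{D}$ (observed data), $x^*$ (location of the optimum), and $y$ (observation at candidate $x$) are notation assumed here, and the paper's variants may condition on different information about the optimum.

$$
\alpha_{\mathrm{ES}}(x) \;=\; H\big[p(x^* \mid \mathcal{D})\big] \;-\; \mathbb{E}_{y \sim p(y \mid x, \mathcal{D})}\Big[ H\big[p(x^* \mid \mathcal{D} \cup \{(x, y)\})\big] \Big]
$$

Because mutual information is symmetric, the same quantity can be expressed through predictive distributions:

$$
\alpha_{\mathrm{ES}}(x) \;=\; H\big[p(y \mid x, \mathcal{D})\big] \;-\; \mathbb{E}_{x^* \sim p(x^* \mid \mathcal{D})}\Big[ H\big[p(y \mid x, \mathcal{D}, x^*)\big] \Big]
$$

The second form only needs a predictive model that can be conditioned on the optimum, which is plausibly the role of the first-stage PFN described above.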

📝 Abstract

Information-theoretic acquisition functions such as Entropy Search (ES) offer a principled exploration-exploitation framework for Bayesian optimization (BO). However, their practical implementation relies on complicated and slow approximations, namely Monte Carlo estimation of the information gain. This complexity can introduce numerical errors and requires specialized, hand-crafted implementations. We propose a two-stage amortization strategy that learns to approximate entropy search-based acquisition functions using Prior-data Fitted Networks (PFNs) in a single forward pass. First, a PFN is trained to condition on information about the optima; second, the $α$-PFN is trained to predict the expected information gain, using information gains measured with the first PFN as training targets. The $α$-PFN offers a flexible learned approximation that replaces the complex heuristic approximations with a single forward pass per candidate, enabling rapid and extensible acquisition evaluation. Empirically, our approach is competitive with state-of-the-art entropy search implementations on synthetic and real-world benchmarks, while accelerating the different entropy search variants across all our experiments, with speedups of over 50×. Source code: https://github.com/automl/AlphaPFN.
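To make the cost of the Monte Carlo baseline concrete, the sketch below estimates an entropy search information gain on a 1-D grid with nested sampling. It is an illustration only, not the paper's method or the AlphaPFN code: the Gaussian process surrogate, the RBF kernel, the grid discretization, the sample counts, and every function name (`rbf`, `gp_posterior`, `argmax_entropy`, `information_gain`) are assumptions made for this example.

```python
import numpy as np

def rbf(a, b, lengthscale=0.2):
    """Squared-exponential kernel between two 1-D point sets."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def gp_posterior(x_train, y_train, x_grid, noise=1e-3):
    """Zero-mean GP posterior mean and covariance on a grid (assumed surrogate)."""
    k_tt = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    k_gt = rbf(x_grid, x_train)
    mean = k_gt @ np.linalg.solve(k_tt, y_train)
    cov = rbf(x_grid, x_grid) - k_gt @ np.linalg.solve(k_tt, k_gt.T)
    return mean, cov

def argmax_entropy(mean, cov, rng, n_samples=128):
    """Plug-in entropy of the argmax location, from sampled grid functions."""
    # Sample via an eigendecomposition so tiny negative eigenvalues are harmless.
    eigval, eigvec = np.linalg.eigh(cov)
    scale = np.sqrt(np.clip(eigval, 0.0, None))
    z = rng.standard_normal((n_samples, len(mean)))
    samples = mean + (z * scale) @ eigvec.T
    counts = np.bincount(samples.argmax(axis=1), minlength=len(mean))
    probs = counts[counts > 0] / n_samples
    return float(-np.sum(probs * np.log(probs)))

def information_gain(mean, cov, j, rng, noise=1e-3, n_fantasies=8):
    """MC estimate of H[p(x*|D)] - E_y H[p(x*|D + (x_j, y))] for grid index j."""
    h_now = argmax_entropy(mean, cov, rng)
    var_j = max(cov[j, j], 0.0) + noise
    gain_vec = cov[:, j] / var_j
    # Conditioning on an observation at x_j shrinks the covariance
    # independently of the observed value.
    cov_new = cov - np.outer(gain_vec, cov[:, j])
    h_next = 0.0
    for _ in range(n_fantasies):
        # Fantasize an observation, condition on it, re-estimate the entropy.
        y = mean[j] + np.sqrt(var_j) * rng.standard_normal()
        mean_new = mean + gain_vec * (y - mean[j])
        h_next += argmax_entropy(mean_new, cov_new, rng)
    return h_now - h_next / n_fantasies

rng = np.random.default_rng(0)
x_train = np.array([0.1, 0.5, 0.9])
y_train = np.sin(6.0 * x_train)
x_grid = np.linspace(0.0, 1.0, 25)
mean, cov = gp_posterior(x_train, y_train, x_grid)
gains = np.array([information_gain(mean, cov, j, rng) for j in range(len(x_grid))])
next_x = x_grid[gains.argmax()]  # candidate with the largest estimated gain
```

Each candidate here costs `n_fantasies` rounds of posterior sampling, and the plug-in entropy estimate is noisy and biased, so individual gains can even come out slightly negative. This per-candidate inner loop is the kind of computation that a single learned forward pass is meant to replace.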

Problem

Research questions and friction points this paper is trying to address.

Bayesian optimization

Entropy Search

acquisition functions

information gain

computational efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

Entropy Search

Prior-data Fitted Networks

Bayesian optimization