Simplifying Bayesian Optimization Via In-Context Direct Optimum Sampling

📅 2025-05-29
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Bayesian optimization (BO) suffers from high computational overhead and intricate hyperparameter tuning when optimizing expensive black-box functions, primarily due to repeated surrogate-model retraining and acquisition-function optimization. Method: This paper introduces zero-shot Bayesian optimization, a paradigm that bypasses surrogate modeling and acquisition optimization entirely. Instead, it directly leverages a pre-trained deep generative foundation model and performs context-based sampling from the posterior distribution over optimal solutions. Contribution/Results: Theoretically equivalent to Thompson sampling, this is claimed as the first BO method achieving full context dependence and zero hyperparameter adjustment. Empirical evaluation on real-world benchmarks demonstrates over a 35× wall-clock speedup, while natively supporting efficient parallel and distributed high-throughput optimization.


๐Ÿ“ Abstract
The optimization of expensive black-box functions is ubiquitous in science and engineering. A common solution to this problem is Bayesian optimization (BO), which generally comprises two components: (i) a surrogate model and (ii) an acquisition function, which typically require expensive re-training and optimization steps at each iteration, respectively. Although recent work enabled in-context surrogate models that do not require re-training, virtually all existing BO methods still require acquisition function maximization to select the next observation, which introduces many knobs to tune, such as Monte Carlo samplers and multi-start optimizers. In this work, we propose a completely in-context, zero-shot solution for BO that requires neither surrogate fitting nor acquisition function optimization. This is done by using a pre-trained deep generative model to directly sample from the posterior over the optimum point. We show that this process is equivalent to Thompson sampling and demonstrate the capabilities and cost-effectiveness of our foundation model on a suite of real-world benchmarks. We achieve an efficiency gain of more than 35x in wall-clock time compared with Gaussian process-based BO, enabling efficient parallel and distributed BO, e.g., for high-throughput optimization.
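The loop the abstract describes can be sketched in a few lines. The snippet below is a toy illustration, not the paper's implementation: `sample_optimum` is a hypothetical stand-in for the pre-trained generative foundation model (here approximated by perturbing the incumbent with shrinking noise), and `objective` is a toy stand-in for an expensive black-box function. The point is the loop structure: each iteration is a single conditional sample given the observation history (a Thompson-style draw of the optimizer's location) followed by one evaluation, with no surrogate re-fitting and no acquisition maximization.

```python
import numpy as np

def objective(x):
    # Toy black-box function to maximize; the real target would be expensive.
    return -(x - 0.3) ** 2

def sample_optimum(history, rng, bounds=(0.0, 1.0)):
    # Hypothetical stand-in for the generative foundation model: given the
    # observation history (the "context"), emit one sample from a posterior
    # over the optimizer's location. Here: perturb the incumbent best point
    # with noise that contracts as the context grows.
    if not history:
        return float(rng.uniform(*bounds))
    x_best = max(history, key=lambda pair: pair[1])[0]
    scale = 0.5 / (1 + len(history))
    return float(np.clip(x_best + rng.normal(0.0, scale), *bounds))

def in_context_bo(n_iters=30, seed=0):
    # No surrogate re-fitting, no acquisition optimization: each step is one
    # conditional sample (Thompson-style) plus one black-box evaluation.
    rng = np.random.default_rng(seed)
    history = []
    for _ in range(n_iters):
        x = sample_optimum(history, rng)   # draw a candidate optimum
        history.append((x, objective(x)))  # evaluate, grow the context
    return history

hist = in_context_bo()
x_best, y_best = max(hist, key=lambda p: p[1])
```

Because each draw is independent given the context, the same structure parallelizes naturally: several candidate optima can be sampled and evaluated in a batch, which is the property the paper exploits for high-throughput optimization.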
Problem

Research questions and friction points this paper is trying to address.

Optimizing expensive black-box functions efficiently
Eliminating surrogate model retraining and acquisition optimization
Enabling fast parallel Bayesian optimization via generative models
Innovation

Methods, ideas, or system contributions that make the work stand out.

In-context direct optimum sampling for BO
Pre-trained deep generative model usage
Eliminates surrogate fitting and acquisition optimization
Gustavo Sutter Pessurno de Carvalho
University of Waterloo, Vector Institute
Mohammed Abdulrahman
University of Waterloo, Vector Institute
Hao Wang
University of Waterloo
Sriram Ganapathi Subramanian
Carleton University
Reinforcement Learning, Deep Learning, Machine Learning
Marc St-Aubin
BMO, Technology & Operations
Sharon O'Sullivan
BMO, Technology & Operations
Lawrence Wan
BMO, Technology & Operations
Luis A. Ricardez-Sandoval
University of Waterloo
Pascal Poupart
University of Waterloo
Artificial Intelligence, Machine Learning, Reinforcement Learning, Federated Learning, NLP
Agustinus Kristiadi
Assistant Professor, Western University
Machine Learning, Uncertainty Quantification, Decision-Making, AI4Science