Simplifying Bayesian Optimization Via In-Context Direct Optimum Sampling

📅 2025-05-29
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Bayesian optimization (BO) suffers from high computational overhead and intricate hyperparameter tuning when optimizing expensive black-box functions, primarily due to repeated surrogate-model retraining and acquisition-function optimization. Method: This paper introduces zero-shot Bayesian optimization, a paradigm that bypasses surrogate modeling and acquisition optimization entirely. Instead, it directly leverages a pre-trained deep generative foundation model and performs context-based sampling from the posterior distribution over optimal solutions. Contribution/Results: Theoretically equivalent to Thompson sampling, this is claimed as the first BO method achieving full context dependence and zero hyperparameter adjustment. Empirical evaluation on real-world benchmarks demonstrates over a 35× wall-clock speedup, while natively supporting efficient parallel and distributed high-throughput optimization.


๐Ÿ“ Abstract
The optimization of expensive black-box functions is ubiquitous in science and engineering. A common solution to this problem is Bayesian optimization (BO), which generally comprises two components: (i) a surrogate model and (ii) an acquisition function, which typically require expensive re-training and optimization steps at each iteration, respectively. Although recent work enabled in-context surrogate models that do not require re-training, virtually all existing BO methods still require acquisition function maximization to select the next observation, which introduces many knobs to tune, such as Monte Carlo samplers and multi-start optimizers. In this work, we propose a completely in-context, zero-shot solution for BO that requires neither surrogate fitting nor acquisition function optimization. This is done by using a pre-trained deep generative model to directly sample from the posterior over the optimum point. We show that this process is equivalent to Thompson sampling and demonstrate the capabilities and cost-effectiveness of our foundation model on a suite of real-world benchmarks. We achieve an efficiency gain of more than 35x in wall-clock time compared with Gaussian process-based BO, enabling efficient parallel and distributed BO, e.g., for high-throughput optimization.
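The loop the abstract describes can be sketched in a few lines. The snippet below is a toy illustration, not the paper's implementation: `sample_optimum` is a hypothetical stand-in for the pre-trained generative foundation model (here approximated by perturbing the incumbent with shrinking noise), and `objective` is a toy stand-in for an expensive black-box function. The point is the loop structure: each iteration is a single conditional sample given the observation history (a Thompson-style draw of the optimizer's location) followed by one evaluation, with no surrogate re-fitting and no acquisition maximization.

```python
import numpy as np

def objective(x):
    # Toy black-box function to maximize; the real target would be expensive.
    return -(x - 0.3) ** 2

def sample_optimum(history, rng, bounds=(0.0, 1.0)):
    # Hypothetical stand-in for the generative foundation model: given the
    # observation history (the "context"), emit one sample from a posterior
    # over the optimizer's location. Here: perturb the incumbent best point
    # with noise that contracts as the context grows.
    if not history:
        return float(rng.uniform(*bounds))
    x_best = max(history, key=lambda pair: pair[1])[0]
    scale = 0.5 / (1 + len(history))
    return float(np.clip(x_best + rng.normal(0.0, scale), *bounds))

def in_context_bo(n_iters=30, seed=0):
    # No surrogate re-fitting, no acquisition optimization: each step is one
    # conditional sample (Thompson-style) plus one black-box evaluation.
    rng = np.random.default_rng(seed)
    history = []
    for _ in range(n_iters):
        x = sample_optimum(history, rng)   # draw a candidate optimum
        history.append((x, objective(x)))  # evaluate, grow the context
    return history

hist = in_context_bo()
x_best, y_best = max(hist, key=lambda p: p[1])
```

Because each draw is independent given the context, the same structure parallelizes naturally: several candidate optima can be sampled and evaluated in a batch, which is the property the paper exploits for high-throughput optimization.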
Problem

Research questions and friction points this paper is trying to address.

Optimizing expensive black-box functions efficiently
Eliminating surrogate model retraining and acquisition optimization
Enabling fast parallel Bayesian optimization via generative models
Innovation

Methods, ideas, or system contributions that make the work stand out.

In-context direct optimum sampling for BO
Pre-trained deep generative model usage
Eliminates surrogate fitting and acquisition optimization
Gustavo Sutter Pessurno de Carvalho
University of Waterloo, Vector Institute
Mohammed Abdulrahman
University of Waterloo, Vector Institute
Hao Wang
University of Waterloo
Sriram Ganapathi Subramanian
Carleton University
Reinforcement Learning, Deep Learning, Machine Learning
Marc St-Aubin
BMO, Technology & Operations
Sharon O'Sullivan
BMO, Technology & Operations
Lawrence Wan
BMO, Technology & Operations
Luis A. Ricardez-Sandoval
University of Waterloo
Pascal Poupart
University of Waterloo
Artificial Intelligence, Machine Learning, Reinforcement Learning, Federated Learning, NLP
Agustinus Kristiadi
Assistant Professor, Western University
Machine Learning, Uncertainty Quantification, Decision-Making, AI4Science