🤖 AI Summary
GP-UCB is a theoretically grounded Bayesian optimization algorithm, but its confidence parameter β grows with the iteration count T (e.g., as O(log T)), leading to over-exploration and slow convergence in practice. This work proposes IRGP-UCB, a variant of GP-UCB that samples the confidence parameter from a shifted exponential distribution, and provides the first rigorous analysis of the expected and conditional expected regret of randomized GP-UCB. On finite input domains, IRGP-UCB admits a confidence-parameter design that does not grow with T, while still achieving a sublinear regret upper bound of O(√(Tγ_T)), where γ_T denotes the maximum information gain. Crucially, the analysis avoids the cumulative inflation of β that afflicts the standard theory. Empirical evaluation on synthetic functions, standard benchmarks, and real-world simulators demonstrates consistent and significant improvements over standard GP-UCB.
📝 Abstract
Gaussian process upper confidence bound (GP-UCB) is a theoretically established algorithm for Bayesian optimization (BO), in which we assume that the objective function $f$ follows a GP. One notable drawback of GP-UCB is that the theoretical confidence parameter $\beta$, which increases along with the iterations, is too large. To alleviate this drawback, this paper analyzes a randomized variant of GP-UCB called improved randomized GP-UCB (IRGP-UCB), which uses a confidence parameter generated from a shifted exponential distribution. We analyze the expected regret and the conditional expected regret, where the expectation and the probability are taken with respect to $f$ and the noise, and with respect to the randomness of the BO algorithm, respectively. In both regret analyses, IRGP-UCB achieves a sub-linear regret upper bound without increasing the confidence parameter if the input domain is finite. Finally, we show numerical experiments using synthetic and benchmark functions and real-world emulators.
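To make the acquisition rule concrete, the following is a minimal NumPy sketch of one IRGP-UCB iteration on a finite 1D grid: compute the GP posterior, draw the confidence parameter from a shifted exponential distribution, and maximize the resulting UCB. The RBF kernel, length scale, and the `shift`/`scale` values are illustrative placeholders, not the constants derived in the paper (which depend on the domain size and the desired guarantee).

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(a, b, ls=0.2):
    # Squared-exponential kernel on 1D inputs (length scale is an assumption).
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

def gp_posterior(X_obs, y_obs, X_grid, noise=1e-4):
    # Standard GP regression posterior mean and standard deviation.
    K = rbf(X_obs, X_obs) + noise * np.eye(len(X_obs))
    Ks = rbf(X_obs, X_grid)
    Kss = rbf(X_grid, X_grid)
    K_inv = np.linalg.inv(K)
    mu = Ks.T @ K_inv @ y_obs
    var = np.diag(Kss - Ks.T @ K_inv @ Ks)
    return mu, np.sqrt(np.clip(var, 0.0, None))

def irgp_ucb_step(X_obs, y_obs, X_grid, shift=1.0, scale=1.0):
    # IRGP-UCB: beta_t is random -- a shifted exponential sample,
    # rather than the deterministic O(log t) schedule of GP-UCB.
    # `shift` and `scale` here are placeholders; the paper specifies
    # them from the (finite) input-domain size.
    beta = shift + rng.exponential(scale)
    mu, sigma = gp_posterior(X_obs, y_obs, X_grid)
    return X_grid[np.argmax(mu + np.sqrt(beta) * sigma)]
```

Because the distribution of the confidence parameter does not change with the iteration count, the UCB does not widen over time the way it does under the classical $\beta_t = O(\log t)$ schedule.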