🤖 AI Summary
This paper addresses online sequential decision-making in non-stationary environments with very large (continuous or infinite) action spaces. We propose the first Bayesian optimization framework that simultaneously ensures dynamic adaptability and computational efficiency. Methodologically, we introduce a novel integration of Gaussian interpolation with a sliding time window to rapidly model non-stationary continuous reward functions; additionally, we impose Lipschitz continuity constraints to guarantee theoretical tractability. We prove that the algorithm achieves a cumulative regret bound of $O^*(\sqrt{T})$, strictly improving upon existing sliding-window Gaussian process approaches. Empirically, our method accelerates computation by two to four orders of magnitude (100×–10,000×) over state-of-the-art baselines, while attaining significantly lower regret and superior real-time performance in dynamic settings.
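The core mechanism described above, kernel (Gaussian) interpolation of recent rewards over a sliding time window, can be sketched roughly as follows. This is an illustrative reconstruction under stated assumptions, not the paper's implementation; the function names, window size `W`, length scale, and candidate-grid action selection are all assumptions.

```python
import numpy as np

def rbf_interpolate(x_obs, y_obs, x_query, length_scale=0.1, reg=1e-8):
    """Gaussian (RBF) kernel interpolation of observed rewards.

    Solves K w = y on the observed points and evaluates the
    interpolant at x_query. A small ridge term keeps K invertible
    when observations are nearly duplicated.
    """
    d_oo = (x_obs[:, None] - x_obs[None, :]) ** 2
    K = np.exp(-d_oo / (2 * length_scale ** 2))
    w = np.linalg.solve(K + reg * np.eye(len(x_obs)), y_obs)
    d_qo = (x_query[:, None] - x_obs[None, :]) ** 2
    return np.exp(-d_qo / (2 * length_scale ** 2)) @ w

# Sliding window: keep only the W most recent (action, reward) pairs,
# so stale observations from an older reward regime are discarded.
W = 50
history_x, history_y = [], []

def select_action(candidates):
    """Pick the candidate action with the highest interpolated reward."""
    if not history_x:
        return np.random.choice(candidates)  # no data yet: explore
    x = np.array(history_x[-W:])
    y = np.array(history_y[-W:])
    est = rbf_interpolate(x, y, np.asarray(candidates))
    return candidates[int(np.argmax(est))]
```

In the non-stationary setting, discarding observations older than the window is what lets the interpolant track a drifting reward function, while solving a single small linear system per round is what avoids the cost of full Gaussian process posterior updates.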
📝 Abstract
Canonical algorithms for multi-armed bandits typically assume a stationary reward environment in which the size of the action space (the number of arms) is small. More recently developed methods typically relax only one of these assumptions: existing non-stationary bandit policies are designed for a small number of arms, while Lipschitz, linear, and Gaussian process bandit policies are designed to handle a large (or infinite) number of arms in stationary reward environments under constraints on the reward function. In this manuscript, we propose a novel policy that learns reward environments over a continuous space using Gaussian interpolation. We show that our method efficiently learns continuous Lipschitz reward functions with $\mathcal{O}^*(\sqrt{T})$ cumulative regret. Furthermore, our method naturally extends to non-stationary problems with a simple modification. Finally, we demonstrate that our method is computationally favorable (100–10,000× faster) and experimentally outperforms sliding-window Gaussian process policies on datasets with non-stationarity and an extremely large number of arms.
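To make the cumulative regret notion concrete, here is a minimal sketch of regret accounting in a toy non-stationary environment. The drifting reward function, the lagging learner, and all names here are illustrative assumptions, not the paper's experimental setup: per-round regret is $f_t(x^*_t) - f_t(x_t)$, and cumulative regret is its running sum over $T$ rounds.

```python
import numpy as np

def regret_trajectory(t_max, drift=0.05):
    """Cumulative regret of a lagging learner in a toy drifting environment.

    The reward function's peak moves each round; the learner plays the
    previous round's peak, so it suffers per-round regret
    f_t(x*_t) - f_t(x_t) >= 0.
    """
    def peak(t):
        return 0.5 + 0.3 * np.sin(drift * t)  # slowly drifting optimum

    def f(x, t):
        return np.exp(-(x - peak(t)) ** 2 / 0.02)  # reward, maximized at peak(t)

    regrets = []
    x_play = peak(0)                 # start at the initial optimum
    for t in range(t_max):
        regrets.append(f(peak(t), t) - f(x_play, t))  # instantaneous regret
        x_play = peak(t)             # lag one round behind the moving optimum
    return np.cumsum(regrets)        # cumulative regret over rounds
```

A policy with $\mathcal{O}^*(\sqrt{T})$ cumulative regret has this running sum grow sublinearly in $T$ (up to logarithmic factors), i.e., its average per-round regret vanishes as $T \to \infty$.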