🤖 AI Summary
This paper addresses bilevel optimization problems in which both the upper- and lower-level objectives are expensive black-box functions, a setting where the nested structure prevents standard Bayesian optimization (BO) from being applied directly. To tackle this challenge, we propose an information-theoretic BO framework for bilevel optimization: a unified acquisition function quantifies the information gain about the optimal solutions and optimal values of both the upper- and lower-level problems, so a single criterion measures the benefit of a query for both levels simultaneously. We also derive an analytically tractable lower-bound approximation of this information gain for efficient computation. The method combines Gaussian process surrogate modeling with a bilevel-structure-aware acquisition strategy. Empirical evaluation across multiple benchmarks demonstrates the effectiveness of the method compared with existing BO variants.
📝 Abstract
A bilevel optimization problem consists of two nested optimization problems, an upper-level problem and a lower-level problem, in which the optimality of the lower-level problem defines a constraint for the upper-level problem. This paper considers Bayesian optimization (BO) for the case in which both the upper and lower levels involve expensive black-box functions. Because of its nested structure, bilevel optimization has a complex problem definition and, compared with other standard extensions of BO such as multi-objective or constrained settings, it has not been widely studied. We propose an information-theoretic approach that considers the information gain about the optimal solutions and optimal values of both the upper- and lower-level problems. This enables us to define a unified criterion that measures the benefit for both levels simultaneously. Further, we show a practical lower-bound-based approach to evaluating the information gain. We empirically demonstrate the effectiveness of our proposed method on several benchmark datasets.
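To make the nested structure described in the abstract concrete, the following is a minimal sketch of a toy bilevel problem solved by brute-force grid search. It is not the paper's method: the objectives `lower_objective` and `upper_objective`, the helper `solve_bilevel_by_grid`, the grids, and the choice to minimize at both levels are all assumptions made for illustration. The sketch only shows why each upper-level candidate requires solving a full lower-level problem, which is what makes the setting costly when every evaluation is expensive.

```python
import numpy as np


def lower_objective(x, z):
    # Hypothetical lower-level objective f(x, z), minimized over z for fixed x.
    return (z - x) ** 2


def upper_objective(x, z):
    # Hypothetical upper-level objective F(x, z), evaluated at z = z*(x).
    return (x - 1.0) ** 2 + (z - 0.5) ** 2


def solve_bilevel_by_grid(x_grid, z_grid):
    """Brute-force nested search (illustrative only, not an efficient solver).

    For every upper-level candidate x, the lower-level problem is solved
    over z_grid first; only the resulting z*(x) is a feasible lower-level
    response for the upper-level objective.
    """
    best = None
    n_evals = 0
    for x in x_grid:
        # Inner loop: solve the lower-level problem for this x.
        lower_vals = np.array([lower_objective(x, z) for z in z_grid])
        n_evals += len(z_grid)
        z_star = z_grid[int(np.argmin(lower_vals))]

        # Outer step: evaluate the upper-level objective at (x, z*(x)).
        value = upper_objective(x, z_star)
        n_evals += 1
        if best is None or value < best[2]:
            best = (float(x), float(z_star), float(value))
    return best, n_evals


if __name__ == "__main__":
    grid = np.linspace(0.0, 1.0, 21)
    (x_best, z_best, f_best), n_evals = solve_bilevel_by_grid(grid, grid)
    print(x_best, z_best, f_best, n_evals)
```

In this toy case the lower-level optimum is z*(x) = x, so the upper-level problem reduces to minimizing (x - 1)^2 + (x - 0.5)^2. Even on a 21-point grid per level, the nested search uses 21 × 21 lower-level evaluations plus 21 upper-level evaluations, 462 in total. That count is the cost a sample-efficient BO criterion is meant to avoid.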