๐ค AI Summary
In Gaussian process regression (GPR), active learning lacks theoretical guarantees on prediction accuracy due to the unknown underlying target distribution.
Method: This paper introduces the first distributionally robust active learning framework for GPR, proposing two sampling strategies that minimize the worst-case expected mean squared error (WCE-MSE). Theoretically, we derive a tight upper bound on WCE-MSE and prove that it can be made arbitrarily small under finite labeled samples. Computationally, we integrate distributionally robust optimization with kernel ridge regression approximation to ensure tractability and scalability.
Results: Extensive experiments on synthetic and real-world datasets demonstrate that our methods significantly outperform classical active learning baselines. Crucially, they achieve substantial improvements in both labeling efficiency and generalization robustnessโeach rigorously supported by theoretical guarantees.
๐ Abstract
Gaussian process regression (GPR) or kernel ridge regression is a widely used and powerful tool for nonlinear prediction. Therefore, active learning (AL) for GPR, which actively collects data labels to achieve an accurate prediction with fewer data labels, is an important problem. However, existing AL methods do not theoretically guarantee prediction accuracy for target distribution. Furthermore, as discussed in the distributionally robust learning literature, specifying the target distribution is often difficult. Thus, this paper proposes two AL methods that effectively reduce the worst-case expected error for GPR, which is the worst-case expectation in target distribution candidates. We show an upper bound of the worst-case expected squared error, which suggests that the error will be arbitrarily small by a finite number of data labels under mild conditions. Finally, we demonstrate the effectiveness of the proposed methods through synthetic and real-world datasets.