Distributionally Robust Active Learning for Gaussian Process Regression

📅 2025-02-24

📈 Citations: 0

✨ Influential: 0

career value

252K/year

🤖 AI Summary

In Gaussian process regression (GPR), active learning lacks theoretical guarantees on prediction accuracy due to the unknown underlying target distribution. Method: This paper introduces the first distributionally robust active learning framework for GPR, proposing two sampling strategies that minimize the worst-case expected mean squared error (WCE-MSE). Theoretically, we derive a tight upper bound on WCE-MSE and prove that it can be made arbitrarily small under finite labeled samples. Computationally, we integrate distributionally robust optimization with kernel ridge regression approximation to ensure tractability and scalability. Results: Extensive experiments on synthetic and real-world datasets demonstrate that our methods significantly outperform classical active learning baselines. Crucially, they achieve substantial improvements in both labeling efficiency and generalization robustness—each rigorously supported by theoretical guarantees.

Technology Category

Application Category

📝 Abstract

Gaussian process regression (GPR) or kernel ridge regression is a widely used and powerful tool for nonlinear prediction. Therefore, active learning (AL) for GPR, which actively collects data labels to achieve an accurate prediction with fewer data labels, is an important problem. However, existing AL methods do not theoretically guarantee prediction accuracy for target distribution. Furthermore, as discussed in the distributionally robust learning literature, specifying the target distribution is often difficult. Thus, this paper proposes two AL methods that effectively reduce the worst-case expected error for GPR, which is the worst-case expectation in target distribution candidates. We show an upper bound of the worst-case expected squared error, which suggests that the error will be arbitrarily small by a finite number of data labels under mild conditions. Finally, we demonstrate the effectiveness of the proposed methods through synthetic and real-world datasets.

Problem

Research questions and friction points this paper is trying to address.

Ensure prediction accuracy in GPR

Reduce worst-case expected error

Effective active learning methods

Innovation

Methods, ideas, or system contributions that make the work stand out.

Distributionally Robust Active Learning

Gaussian Process Regression

Worst-case Expected Error Reduction

🔎 Similar Papers

No similar papers found.