Calibrated Uncertainty Sampling for Active Learning

📅 2025-10-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses the loss of generalization performance and predictive reliability that active learning suffers when model uncertainty is poorly calibrated. It proposes a new active learning strategy that uses calibration error explicitly as the criterion for sample selection: a kernel-based calibration error estimator is built into the acquisition function, so that the unlabeled samples with the highest estimated calibration error are labeled first. Under the covariate shift assumption, the authors show formally that active learning with this acquisition function leads to a bounded calibration error on both the unlabeled pool and the unseen test set. Experiments across pool-based active learning settings show lower classification error and lower expected calibration error (ECE) than the baseline acquisition functions. The contributions are threefold: (i) a principled integration of calibration-awareness into the active learning objective; (ii) theoretical guarantees on calibration error under covariate shift; and (iii) a practical implementation grounded in nonparametric (kernel) estimation.
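
For readers unfamiliar with the ECE metric mentioned above, the following is a minimal sketch of the common equal-width binned estimate. It is not taken from the paper: the function name, the default of 10 bins, and the binning scheme are assumptions made for illustration, and the paper's own evaluation protocol may differ.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sample-weighted mean of |accuracy - mean confidence| per bin.

    confidences: top-class probabilities in [0, 1]
    correct:     booleans, True where the top-class prediction was right
    n_bins:      number of equal-width confidence bins (an assumed default)
    """
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, h in members if h) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece
```

A model that predicts with 0.9 confidence but is right only three times out of four would score an ECE of about 0.15 under this estimate.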

📝 Abstract

We study the problem of actively learning a classifier with a low calibration error. One of the most popular Acquisition Functions (AFs) in pool-based Active Learning (AL) is querying by the model's uncertainty. However, we recognize that an uncalibrated uncertainty model on the unlabeled pool may significantly affect the AF's effectiveness, leading to sub-optimal generalization and high calibration error on unseen data. Deep Neural Networks (DNNs) make this even worse, as the model uncertainty from a DNN is usually uncalibrated. Therefore, we propose a new AF that estimates calibration errors and queries the samples with the highest calibration error before leveraging DNN uncertainty. Specifically, we utilize a kernel calibration error estimator under covariate shift and formally show that AL with this AF eventually leads to a bounded calibration error on the unlabeled pool and unseen test data. Empirically, our proposed method surpasses other AF baselines by achieving lower calibration and generalization errors across pool-based AL settings.
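
The uncertainty-based querying that the abstract takes as its starting point can be sketched as below. This is the standard least-confidence rule, shown only as a baseline for comparison; the function name and the list-of-lists input format are assumptions, and the abstract does not say which uncertainty score the paper's baselines use.

```python
def least_confidence_query(pool_probs, batch_size):
    """Return indices of the pool samples whose top-class probability is lowest.

    pool_probs: one list of class probabilities per unlabeled sample
    batch_size: number of samples to send for labeling
    """
    # Uncertainty is 1 minus the predicted (top-class) probability.
    uncertainty = [1.0 - max(probs) for probs in pool_probs]
    # Rank pool indices from most to least uncertain.
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: uncertainty[i],
                    reverse=True)
    return ranked[:batch_size]
```

If the predicted probabilities are miscalibrated, this ranking inherits the error, which is the weakness the abstract points to.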

Problem

Research questions and friction points this paper is trying to address.

Active learning for classifiers with low calibration error

Addressing uncalibrated uncertainty in deep neural networks

Proposing calibrated uncertainty sampling for better generalization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Estimates calibration errors for active learning

Queries samples with highest calibration error

Uses a kernel calibration error estimator under covariate shift
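
The three points above can be illustrated with a rough sketch. It is not the paper's estimator: it swaps in a simple Nadaraya-Watson (Gaussian kernel) smoother over scalar confidences, omits any covariate-shift correction, and uses hypothetical names (`estimated_calibration_gap`, `calibration_error_query`) and a hypothetical default bandwidth of 0.1.

```python
import math


def gaussian_kernel(u, bandwidth):
    """Unnormalized Gaussian kernel weight for a distance u."""
    return math.exp(-0.5 * (u / bandwidth) ** 2)


def estimated_calibration_gap(pool_conf, labeled_conf, labeled_correct,
                              bandwidth=0.1):
    """Per-sample |confidence - kernel-smoothed accuracy| for each pool point.

    Accuracy near a given confidence level is estimated from labeled data
    with a Nadaraya-Watson average (an assumption made for this sketch).
    """
    gaps = []
    for c in pool_conf:
        weights = [gaussian_kernel(c - lc, bandwidth) for lc in labeled_conf]
        total = sum(weights)
        if total == 0.0:
            # No labeled point is close enough; report a gap of zero.
            gaps.append(0.0)
            continue
        smoothed_acc = sum(
            w * (1.0 if hit else 0.0)
            for w, hit in zip(weights, labeled_correct)
        ) / total
        gaps.append(abs(c - smoothed_acc))
    return gaps


def calibration_error_query(pool_conf, labeled_conf, labeled_correct,
                            batch_size, bandwidth=0.1):
    """Return indices of the pool samples with the largest estimated gap."""
    gaps = estimated_calibration_gap(pool_conf, labeled_conf,
                                     labeled_correct, bandwidth)
    ranked = sorted(range(len(gaps)), key=lambda i: gaps[i], reverse=True)
    return ranked[:batch_size]
```

With labeled data showing roughly 50% accuracy at every confidence level, a pool sample predicted at 0.95 confidence gets a much larger estimated gap than one predicted at 0.55, so it would be queried first. A version closer to the paper would presumably account for the shift between the labeled and pool distributions, which this sketch does not attempt.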

🔎 Similar Papers

Bridging Diversity and Uncertainty in Active learning with Self-Supervised Pre-Training