Uncertainty Herding: One Active Learning Method for All Label Budgets

📅 2024-12-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing active learning methods perform unevenly across label budgets: methods designed for the high-budget regime can fall below random selection when labels are scarce, while low-budget methods degrade as the budget grows. Because the boundary between "low" and "high" budgets varies by problem, choosing a method in advance is difficult. This work proposes *uncertainty coverage*, an objective that generalizes a variety of low- and high-budget objectives. Greedy optimization of an estimate of this objective yields *Uncertainty Herding*, a computationally fast, hyperparameter-light algorithm that interpolates smoothly between the two regimes. The authors prove that it nearly optimizes the distribution-level coverage. In experiments across a variety of active learning tasks, Uncertainty Herding matches or beats state-of-the-art methods in essentially all cases, from ultra-low budgets (e.g., 10–50 labels) to high-budget regimes, and the authors describe it as the only method they are aware of that reliably works well in both settings.

📝 Abstract

Most active learning research has focused on methods which perform well when many labels are available, but can be dramatically worse than random selection when label budgets are small. Other methods have focused on the low-budget regime, but do poorly as label budgets increase. As the line between "low" and "high" budgets varies by problem, this is a serious issue in practice. We propose uncertainty coverage, an objective which generalizes a variety of low- and high-budget objectives, as well as natural, hyperparameter-light methods to smoothly interpolate between low- and high-budget regimes. We call greedy optimization of the estimate Uncertainty Herding; this simple method is computationally fast, and we prove that it nearly optimizes the distribution-level coverage. In experimental validation across a variety of active learning tasks, our proposal matches or beats state-of-the-art performance in essentially all cases; it is the only method of which we are aware that reliably works well in both low- and high-budget settings.
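
The abstract does not state the objective explicitly, so the following is only an illustrative sketch of what greedy optimization of an uncertainty-weighted coverage estimate might look like. It assumes an objective of the form mean_i u_i · max_{j∈S} k(x_i, x_j), where u_i is a per-point uncertainty score and k is an RBF kernel. The function names, the kernel choice, and the `lengthscale` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np

def rbf_kernel(X, Y, lengthscale=1.0):
    """Pairwise RBF similarities between rows of X and rows of Y."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * lengthscale ** 2))

def coverage_estimate(K, uncertainty, selected):
    """Assumed objective: mean_i u_i * max_{j in selected} K[i, j]."""
    if len(selected) == 0:
        return 0.0
    return float((uncertainty * K[:, selected].max(axis=1)).mean())

def greedy_uncertainty_coverage(X, uncertainty, budget, lengthscale=1.0):
    """Greedily pick `budget` pool indices maximizing the assumed objective."""
    n = X.shape[0]
    K = rbf_kernel(X, X, lengthscale)
    best = np.zeros(n)                # best[i] = max similarity to any selected point
    chosen = np.zeros(n, dtype=bool)  # mask of already-selected indices
    selected = []
    for _ in range(min(budget, n)):
        # Objective value if candidate j were added, for every j at once.
        scores = (uncertainty[:, None] * np.maximum(best[:, None], K)).mean(axis=0)
        scores[chosen] = -np.inf      # never re-select a point
        j = int(np.argmax(scores))
        selected.append(j)
        chosen[j] = True
        best = np.maximum(best, K[:, j])
    return selected

# Toy pool: two tight clusters of three points each.
X_pool = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                   [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]])
uniform = np.ones(len(X_pool))
picks_uniform = greedy_uncertainty_coverage(X_pool, uniform, budget=2)

# Concentrating uncertainty on the second cluster steers the first pick there.
skewed = np.array([0.1, 0.1, 0.1, 1.0, 1.0, 1.0])
picks_skewed = greedy_uncertainty_coverage(X_pool, skewed, budget=1)
```

Under these assumptions, uniform uncertainty reduces the rule to a pure coverage criterion that spreads picks across clusters, which is the kind of behavior low-budget methods aim for. Sharply peaked uncertainty instead pulls selections toward the uncertain region, as high-budget uncertainty-based methods do.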

Problem

Research questions and friction points this paper is trying to address.

Active Learning

Label Scarcity

Label Abundance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uncertainty-Driven Approach

Active Learning

Label Quantity Adaptability
