Optimal Labeler Assignment and Sampling for Active Learning in the Presence of Imperfect Labels

📅 2025-12-14
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address high label noise in active learning caused by annotator ability disparities—particularly erroneous labeling of complex instances—this paper proposes a robust, noise-resilient active learning framework. Methodologically: (1) it formulates an optimal annotator allocation model grounded in game theory, minimizing the worst-case potential noise per iteration; (2) it introduces an uncertainty-aware, noise-robust sampling strategy; and (3) it integrates multi-annotator confidence-weighted ensemble learning with noise-robust loss modeling. Extensive experiments across multiple benchmark datasets demonstrate an average 5.2% improvement in classification accuracy and a 37% reduction in label-noise sensitivity, significantly outperforming state-of-the-art active learning methods. The core contribution lies in the first unified integration of annotator capability modeling, noise-aware sampling, and robust ensemble learning within a closed-loop active learning pipeline.

Technology Category

Application Category

📝 Abstract
Active Learning (AL) has garnered significant interest across various application domains where labeling training data is costly. AL provides a framework that helps practitioners query informative samples for annotation by oracles (labelers). However, these labels often contain noise due to varying levels of labeler accuracy. Additionally, uncertain samples are more prone to receiving incorrect labels because of their complexity. Learning from imperfectly labeled data leads to an inaccurate classifier. We propose a novel AL framework to construct a robust classification model by minimizing noise levels. Our approach includes an assignment model that optimally assigns query points to labelers, aiming to minimize the maximum possible noise within each cycle. Additionally, we introduce a new sampling method to identify the best query points, reducing the impact of label noise on classifier performance. Our experiments demonstrate that our approach significantly improves classification performance compared to several benchmark methods.
Problem

Research questions and friction points this paper is trying to address.

Minimize label noise in active learning cycles
Optimally assign query points to labelers
Reduce impact of imperfect labels on classifier
Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimal labeler assignment to minimize noise
New sampling method to reduce label noise impact
Robust classification model from imperfect labels
🔎 Similar Papers
No similar papers found.
P
Pouya Ahadi
School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
B
Blair Winograd
Ford Motor Company
C
Camille Zaug
Ford Motor Company
K
Karunesh Arora
Ford Motor Company
Lijun Wang
Lijun Wang
Zhejiang University
Statistical LearningBioinformaticsAstrophysics
Kamran Paynabar
Kamran Paynabar
Unknown affiliation