Optimal Labeler Assignment and Sampling for Active Learning in the Presence of Imperfect Labels

📅 2025-12-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address high label noise in active learning caused by annotator ability disparities—particularly erroneous labeling of complex instances—this paper proposes a robust, noise-resilient active learning framework. Methodologically: (1) it formulates an optimal annotator allocation model grounded in game theory, minimizing the worst-case potential noise per iteration; (2) it introduces an uncertainty-aware, noise-robust sampling strategy; and (3) it integrates multi-annotator confidence-weighted ensemble learning with noise-robust loss modeling. Extensive experiments across multiple benchmark datasets demonstrate an average 5.2% improvement in classification accuracy and a 37% reduction in label-noise sensitivity, significantly outperforming state-of-the-art active learning methods. The core contribution lies in the first unified integration of annotator capability modeling, noise-aware sampling, and robust ensemble learning within a closed-loop active learning pipeline.

Technology Category

Application Category

📝 Abstract

Active Learning (AL) has garnered significant interest across various application domains where labeling training data is costly. AL provides a framework that helps practitioners query informative samples for annotation by oracles (labelers). However, these labels often contain noise due to varying levels of labeler accuracy. Additionally, uncertain samples are more prone to receiving incorrect labels because of their complexity. Learning from imperfectly labeled data leads to an inaccurate classifier. We propose a novel AL framework to construct a robust classification model by minimizing noise levels. Our approach includes an assignment model that optimally assigns query points to labelers, aiming to minimize the maximum possible noise within each cycle. Additionally, we introduce a new sampling method to identify the best query points, reducing the impact of label noise on classifier performance. Our experiments demonstrate that our approach significantly improves classification performance compared to several benchmark methods.

Problem

Research questions and friction points this paper is trying to address.

Minimize label noise in active learning cycles

Optimally assign query points to labelers

Reduce impact of imperfect labels on classifier

Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimal labeler assignment to minimize noise

New sampling method to reduce label noise impact

Robust classification model from imperfect labels

🔎 Similar Papers

No similar papers found.

Authors to Follow