Deep Active Re-Labeling: Toward Noise-Resilient Annotation Efficiency

📅 2026-06-07
📈 Citations: 0 (influential: 0)
🤖 AI Summary
Deep active learning degrades significantly in the presence of label noise and, in some cases, underperforms passive learning. To address this, the work proposes a re-labeling framework inspired by human learning patterns that allocates a portion of the annotation budget to re-examining suspicious samples that have already been labeled. Two active noise sampling strategies identify potentially mislabeled instances, giving the active learning process a retrospective, introspective behavior. Under the same total annotation budget, the approach is more data-efficient and yields a relatively noise-free annotated dataset.
📝 Abstract
While Deep Active Learning (DAL) effectively reduces human annotation costs, its efficacy is constrained by human annotation errors. Because the data sampled for active learning is assumed to be highly informative for training, errors in its labels are especially harmful: when human annotators mislabel this data at a certain rate, active learning performance drops significantly and, in some cases, falls below that of passive learning. In this paper, we first analyze the impact of human annotation errors in the DAL setting. We then propose a framework to address human annotation noise in DAL. Informed by human learning patterns, the core idea is to allocate a portion of the human annotation budget to re-annotating data that has already been labeled. Previous theoretical work suggests that when the model has some ability to identify potentially noisy data, re-labeling even a small fraction of the data can effectively remove noise from the active training set. To achieve this, we implement two active noise sampling strategies that detect noise under different circumstances, and we allocate part of the annotation budget to re-annotating the detected instances. Our approach gives active learning a revisiting, introspective behavior. Our experiments demonstrate that, under the same annotation budget, our method is more data-efficient and ultimately yields a relatively noise-free annotated dataset.
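The budget-splitting idea in the abstract can be illustrated with a minimal sketch. The abstract does not describe the two noise sampling strategies, so the code below substitutes a generic heuristic (rank labeled samples by the model's loss on the annotator's label) purely for illustration. The function names `split_budget` and `select_suspicious` and the `relabel_fraction` parameter are hypothetical and do not come from the paper.

```python
import math


def split_budget(round_budget, relabel_fraction):
    """Split one round's annotation budget into (new_labels, relabels).

    relabel_fraction is a hypothetical knob, not a value from the paper.
    """
    if not 0.0 <= relabel_fraction <= 1.0:
        raise ValueError("relabel_fraction must be in [0, 1]")
    relabels = int(round_budget * relabel_fraction)
    return round_budget - relabels, relabels


def select_suspicious(probs, given_labels, k):
    """Return indices of the k labeled samples most likely to be mislabeled.

    probs[i][c] is the model's predicted probability of class c for sample i.
    Assumption: a high loss on the annotator's label signals a suspicious
    label. This stands in for the paper's strategies, which are not specified
    in the abstract.
    """
    # Negative log-likelihood of the given label; clamp to avoid log(0).
    scores = [-math.log(max(p[y], 1e-12)) for p, y in zip(probs, given_labels)]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Usage: 100 annotations this round, a quarter reserved for re-labeling.
new_labels, relabels = split_budget(100, 0.25)   # (75, 25)

# Four labeled samples, two classes; samples 1 and 3 disagree with the model.
probs = [[0.9, 0.1], [0.2, 0.8], [0.05, 0.95], [0.6, 0.4]]
given_labels = [0, 0, 1, 1]
to_review = select_suspicious(probs, given_labels, k=2)   # [1, 3]
```

In a full loop, the `new_labels` share would go to the usual acquisition function over the unlabeled pool, while the `relabels` share would send the `to_review` samples back to annotators before the next training round.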
Problem

Research questions and friction points this paper is trying to address.

Deep Active Learning
Annotation Noise
Human Annotation Errors
Noise-Resilient Learning
Active Learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep Active Learning
Annotation Noise
Re-labeling
Active Sampling
Noise Resilience