Detect and Correct: A Selective Noise Correction Method for Learning with Noisy Labels

📅 2025-05-19
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address performance degradation of deep models under label noise, this paper proposes a **selective noise correction method** that jointly detects and corrects noisy labels. First, it dynamically identifies potentially noisy samples based on loss distribution and separates clean and noisy subsets via a two-stage mechanism. Subsequently, it estimates a **local noise transition matrix** exclusively on the noisy subset, enabling end-to-end differentiable loss reweighting—thereby preserving all samples while avoiding overcorrection induced by global noise modeling. This is the first approach to achieve **dynamic selective correction**, balancing data integrity with correction precision. Extensive experiments on MNIST, CIFAR-10/100, and scRNA-seq cell-type annotation datasets demonstrate consistent improvements: average accuracy gains of 3.2–5.7% over state-of-the-art methods, alongside significantly enhanced robustness and generalization.

Technology Category

Application Category

📝 Abstract
Falsely annotated samples, also known as noisy labels, can significantly harm the performance of deep learning models. Two main approaches for learning with noisy labels are global noise estimation and data filtering. Global noise estimation approximates the noise across the entire dataset using a noise transition matrix, but it can unnecessarily adjust correct labels, leaving room for local improvements. Data filtering, on the other hand, discards potentially noisy samples but risks losing valuable data. Our method identifies potentially noisy samples based on their loss distribution. We then apply a selection process to separate noisy and clean samples and learn a noise transition matrix to correct the loss for noisy samples while leaving the clean data unaffected, thereby improving the training process. Our approach ensures robust learning and enhanced model performance by preserving valuable information from noisy samples and refining the correction process. We applied our method to standard image datasets (MNIST, CIFAR-10, and CIFAR-100) and a biological scRNA-seq cell-type annotation dataset. We observed a significant improvement in model accuracy and robustness compared to traditional methods.
Problem

Research questions and friction points this paper is trying to address.

Identify and correct noisy labels in datasets
Separate clean and noisy samples for targeted correction
Improve model accuracy by refining noise transition matrix
Innovation

Methods, ideas, or system contributions that make the work stand out.

Identifies noisy samples via loss distribution analysis
Separates and corrects noisy samples selectively
Preserves clean data while refining noisy labels