AI Summary
Deep neural networks are prone to overfitting under label noise, leading to severe degradation in generalization performance. To address this, we propose SelectMix, a noise-robust learning framework comprising three key components: (1) confidence-mismatch analysis via K-fold cross-validation to precisely identify uncertain samples; (2) a class-aware selective mixing strategy that fuses only low-confidence samples with high-confidence samples from the same underlying class; and (3) soft-label alignment for mixed samples to preserve supervision fidelity and prevent noise propagation. Unlike conventional Mixup, SelectMix abandons indiscriminate interpolation and enables controllable, semantics-preserving augmentation. Extensive experiments on MNIST, CIFAR-10/100 under various synthetic noise settings, and the real-world noisy dataset Clothing1M demonstrate consistent superiority over state-of-the-art methods, validating both effectiveness and generalizability.
Abstract
Deep neural networks tend to memorize noisy labels, severely degrading their generalization performance. Although Mixup has demonstrated effectiveness in improving generalization and robustness, existing Mixup-based methods typically perform indiscriminate mixing without principled guidance on sample selection and mixing strategy, inadvertently propagating noisy supervision. To overcome these limitations, we propose SelectMix, a confidence-guided mixing framework explicitly tailored for noisy labels. SelectMix first identifies potentially noisy or ambiguous samples through confidence-based mismatch analysis using K-fold cross-validation, then selectively blends the identified uncertain samples with confidently predicted peers from their potential classes. Furthermore, SelectMix employs soft labels derived from all classes involved in the mixing process, ensuring the labels accurately represent the composition of the mixed samples and thus aligning supervision signals closely with the actual mixed inputs. Through extensive theoretical analysis and empirical evaluations on multiple synthetic (MNIST, Fashion-MNIST, CIFAR-10, CIFAR-100) and real-world benchmark datasets (CIFAR-N, MNIST, and Clothing1M), we demonstrate that SelectMix consistently outperforms strong baseline methods, validating its effectiveness and robustness in learning with noisy labels.
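The selective mixing and soft-label alignment steps described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the mixing coefficient `lam`, the function name `selectmix_pair`, and the one-hot soft-label construction are hypothetical, modeled on standard Mixup.

```python
import numpy as np

def selectmix_pair(x_uncertain, y_potential, x_confident, y_confident,
                   num_classes, lam=0.7):
    """Blend one uncertain sample with a confidently predicted peer from
    its potential class, and build a soft label over all classes involved.

    Illustrative sketch only; `lam` is an assumed fixed mixing weight
    favoring the confident sample, analogous to the Mixup coefficient.
    """
    # Convex combination of inputs, weighted toward the confident peer.
    x_mix = lam * x_confident + (1.0 - lam) * x_uncertain
    # Soft label: mass on each involved class in proportion to its share
    # of the mixed input. If both classes coincide, this is one-hot.
    y_soft = np.zeros(num_classes)
    y_soft[y_confident] += lam
    y_soft[y_potential] += 1.0 - lam
    return x_mix, y_soft

# Toy usage: a 4-class problem with 2-dimensional "images". The uncertain
# sample's potential class (3) matches its confident peer's class, so the
# soft label collapses to one-hot on class 3.
x_u = np.array([0.2, 0.8])
x_c = np.array([1.0, 0.0])
x_mix, y_soft = selectmix_pair(x_u, 3, x_c, 3, num_classes=4)
# x_mix → [0.76, 0.24]; y_soft → [0., 0., 0., 1.]
```

Because the soft label always sums to one and places mass only on the classes that actually contributed pixels, the supervision signal matches the composition of the mixed input, which is the alignment property the abstract emphasizes.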