🤖 AI Summary
Deep neural networks (DNNs) suffer significant performance degradation when trained on noisy labels. To address this, we propose a robust learning framework grounded in the principle of machine unlearning that operates along two paths, one at the data level and one at the model level: first, gradient attribution partitions the training data by quality without assuming a clean label distribution; second, regression-based sensitivity analysis identifies and prunes noise-vulnerable neurons; finally, the pruned network is selectively fine-tuned on the high-quality subset alone. This approach pioneers the integration of data-level unlearning with model-level, interpretability-aware adaptation, eliminating the need for full retraining or explicit noise modeling. On CIFAR-10 with synthetic label noise, our method achieves an absolute accuracy improvement of approximately 10% and reduces retraining time by up to 47%. Extensive evaluations across image classification and speech recognition tasks demonstrate strong generalizability and scalability.
📝 Abstract
Deep neural networks (DNNs) have achieved remarkable success across diverse domains, but their performance can be severely degraded by noisy or corrupted training data. Conventional noise mitigation methods often rely on explicit assumptions about the noise distribution or require extensive retraining, which can be impractical for large-scale models. Inspired by the principles of machine unlearning, we propose a novel framework that integrates attribution-guided data partitioning, discriminative neuron pruning, and targeted fine-tuning to mitigate the impact of noisy samples. Our approach employs gradient-based attribution to probabilistically distinguish high-quality examples from potentially corrupted ones without imposing restrictive assumptions on the noise distribution. It then applies regression-based sensitivity analysis to identify and prune the neurons that are most vulnerable to noise. Finally, the resulting network is fine-tuned on the high-quality data subset to efficiently recover and enhance its generalization performance. This integrated unlearning-inspired framework provides several advantages over conventional noise-robust learning approaches. Notably, it combines data-level unlearning with model-level adaptation, thereby avoiding the need for full model retraining or explicit noise modeling. We evaluate our method on representative tasks (e.g., CIFAR-10 image classification and speech recognition) under various noise levels and observe substantial gains in both accuracy and efficiency. For example, our framework achieves approximately a 10% absolute accuracy improvement over standard retraining on CIFAR-10 with injected label noise, while reducing retraining time by up to 47%. These results demonstrate the effectiveness and scalability of the proposed approach for achieving robust generalization in noisy environments.
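The three-stage pipeline described above can be sketched end to end on a toy problem. The sketch below is illustrative only and makes several simplifying assumptions not taken from the paper: a tiny NumPy-only one-hidden-layer network stands in for the DNN, the per-sample output residual serves as a proxy for the gradient-attribution score, a simple linear fit of each neuron's activation against that score serves as the regression-based sensitivity measure, and the 70%/25% partition and pruning quantiles are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: linearly separable labels with 20% synthetic label noise.
n, d, h = 400, 10, 16
X = rng.normal(size=(n, d))
y_true = (X @ rng.normal(size=d) > 0).astype(float)
flip = rng.random(n) < 0.2
y_obs = np.where(flip, 1.0 - y_true, y_true)

# One-hidden-layer network (a stand-in for the DNN in the paper).
W1 = rng.normal(scale=0.5, size=(d, h)); b1 = np.zeros(h)
W2 = rng.normal(scale=0.5, size=h);      b2 = 0.0

def forward(X):
    A = np.maximum(X @ W1 + b1, 0.0)          # ReLU hidden activations
    p = 1.0 / (1.0 + np.exp(-(A @ W2 + b2)))  # sigmoid output
    return A, p

def train(X, y, epochs, lr, keep=None):
    """Gradient descent on binary cross-entropy; `keep` freezes pruned neurons."""
    global W1, b1, W2, b2
    for _ in range(epochs):
        A, p = forward(X)
        g = (p - y) / len(y)                  # dL/dlogit for BCE loss
        dA = np.outer(g, W2) * (A > 0)        # backprop through ReLU
        W2 -= lr * (A.T @ g); b2 -= lr * g.sum()
        W1 -= lr * (X.T @ dA); b1 -= lr * dA.sum(0)
        if keep is not None:                  # keep pruned neurons at zero
            W1[:, ~keep] = 0.0; b1[~keep] = 0.0; W2[~keep] = 0.0

train(X, y_obs, epochs=300, lr=0.3)

# Step 1: attribution-guided partitioning. The residual |p - y| is proportional
# to the per-sample output-gradient magnitude; large values flag likely noise.
A, p = forward(X)
score = np.abs(p - y_obs)
clean = score < np.quantile(score, 0.7)       # retain the lowest-score ~70%

# Step 2: regression-based sensitivity analysis. Fit each neuron's activation
# against the noise score; prune neurons whose activations track the noise.
sens = np.array([abs(np.polyfit(score, A[:, j], 1)[0]) for j in range(h)])
keep = sens < np.quantile(sens, 0.75)         # prune the most sensitive ~25%
W1[:, ~keep] = 0.0; b1[~keep] = 0.0; W2[~keep] = 0.0

# Step 3: fine-tune the pruned network on the high-quality subset only.
train(X[clean], y_obs[clean], epochs=150, lr=0.1, keep=keep)

_, p_final = forward(X)
accuracy = float(((p_final > 0.5) == y_true).mean())  # vs. the true labels
```

A full implementation would replace the residual proxy with true per-sample gradient norms and the linear fit with the paper's regression procedure, but the control flow, partition, prune, fine-tune, mirrors the framework.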