Self Distillation via Iterative Constructive Perturbations

📅 2025-05-20
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Balancing training performance and generalization remains challenging in deep neural network optimization. To address this, we propose a cyclic optimization framework that jointly updates model parameters and input data. Our core innovation is the Iterative Constructive Perturbation (ICP) mechanism: it generates input perturbations guided by model loss, while integrating self-distillation and intermediate-layer feature alignment to establish a bidirectional model–data adaptation paradigm. This approach unifies loss-driven input reconstruction with progressive knowledge transfer, effectively mitigating overfitting and training stagnation. Extensive experiments across diverse training regimes—including standard supervised learning, label-noise robustness, and few-shot learning—demonstrate consistent improvements in both accuracy and generalization. The framework exhibits strong robustness and broad applicability, validating its effectiveness beyond specific task assumptions.

📝 Abstract
Deep Neural Networks have achieved remarkable success across various domains; however, balancing performance and generalization while training these networks remains a challenge. In this paper, we propose a novel framework that uses a cyclic optimization strategy to concurrently optimize the model and its input data for better training, rethinking the traditional training paradigm. Central to our approach is Iterative Constructive Perturbation (ICP), which leverages the model's loss to iteratively perturb the input, progressively constructing an enhanced representation over several refinement steps. This ICP input is then fed back into the model to produce improved intermediate features, which serve as a target in a self-distillation framework against the original features. By alternately adapting the model's parameters to the data and the data to the model, our method effectively bridges the gap between fitting and generalization, leading to enhanced performance. Extensive experiments demonstrate that our approach not only mitigates common performance bottlenecks in neural networks but also yields significant improvements across training variations.
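The core ICP step described above — descending the loss with respect to the *input* for a few refinement steps, then comparing features of the original and refined inputs — can be illustrated with a minimal numpy sketch. Everything here (the toy linear model `W`, `v`, the step size `alpha`, the number of refinement steps, and the MSE feature alignment) is an illustrative assumption, not the paper's implementation, which would use a deep network with backpropagated input gradients.

```python
import numpy as np

# Toy linear "network": features h = W x, prediction v . h, squared loss.
# All shapes and hyperparameters are illustrative assumptions.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
v = rng.normal(size=4)

def features(x):
    return W @ x

def loss(x, y_true):
    return 0.5 * (v @ features(x) - y_true) ** 2

def input_grad(x, y_true):
    # d(loss)/dx for the toy model above (chain rule by hand;
    # a real network would use autograd)
    return (v @ (W @ x) - y_true) * (W.T @ v)

def icp(x, y_true, steps=10, alpha=0.01):
    """Iterative Constructive Perturbation: descend the loss w.r.t. the
    input (the opposite of an adversarial ascent) for a few steps,
    progressively constructing a refined input."""
    x_icp = x.copy()
    for _ in range(steps):
        x_icp -= alpha * input_grad(x_icp, y_true)
    return x_icp

x = rng.normal(size=3)
y_true = 1.0
x_star = icp(x, y_true)

# Self-distillation signal: align the original input's features with
# those of the ICP-refined input (a simple MSE here; the paper aligns
# intermediate-layer features).
distill_loss = np.mean((features(x) - features(x_star)) ** 2)
print("loss before/after ICP:", loss(x, y_true), loss(x_star, y_true))
```

Because the perturbation follows the negative input gradient, the refined input `x_star` attains a lower loss than `x`, which is what makes its features a useful distillation target.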
Problem

Research questions and friction points this paper is trying to address.

Balancing performance and generalization in Deep Neural Networks
Cyclic optimization of model and input data for better training
Mitigating performance bottlenecks and improving training variations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cyclic optimization strategy for model and data
Iterative Constructive Perturbation enhances input representation
Self-distillation framework improves intermediate features
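The three contributions above combine into one cyclic training step: refine the input with loss-guided descent (data adapted to the model), then update the parameters with the task loss plus a feature-alignment distillation term (model adapted to the data). The numpy sketch below is a hypothetical, self-contained toy version of that cycle; the model, hand-derived gradients, and coefficients (`alpha`, `lr`, `lam`) are assumptions for illustration only.

```python
import numpy as np

# Toy linear model: intermediate features h = W x, prediction v . h.
rng = np.random.default_rng(1)
W = rng.normal(size=(4, 3)) * 0.5
v = rng.normal(size=4) * 0.5

def forward(x):
    h = W @ x          # intermediate features
    return h, v @ h    # (features, prediction)

def train_step(x, y, steps=5, alpha=0.05, lr=0.05, lam=0.1):
    """One cycle: (1) ICP-refine the input, (2) update parameters with
    task loss + distillation loss toward the refined input's features."""
    global W, v
    # --- adapt data to model: loss-guided input refinement ---
    x_icp = x.copy()
    for _ in range(steps):
        _, pred = forward(x_icp)
        x_icp -= alpha * (pred - y) * (W.T @ v)  # descend loss w.r.t. input
    # --- adapt model to data: parameter update ---
    h, pred = forward(x)
    h_t, _ = forward(x_icp)   # distillation target (treated as fixed)
    r = pred - y
    # Gradients of 0.5*r^2 + 0.5*lam*||h - h_t||^2 w.r.t. W and v,
    # with h_t detached, as is usual in self-distillation.
    gW = r * np.outer(v, x) + lam * np.outer(h - h_t, x)
    gv = r * h
    W -= lr * gW
    v -= lr * gv
    return 0.5 * r ** 2   # task loss before this update

x = rng.normal(size=3)
y = 1.0
for _ in range(3):
    task_loss = train_step(x, y)
```

Treating the refined features `h_t` as a fixed target is the design choice that makes this self-distillation rather than joint optimization: gradients flow only through the original input's branch.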
Maheak Dave
Techno India University
Aniket Kumar Singh
Youngstown State University & Ultium Cells
Natural Language Processing, Computer Vision, Deep Learning, Renewable Energy, AI Ethics
Aryan Pareek
Techno India University
Harshita Jha
Techno India University
Debasis Chaudhuri
Techno India University
Manish Pratap Singh
DRDO Young Scientist Laboratory - Cognitive Technologies