Improving Adversarial Robustness Through Adaptive Learning-Driven Multi-Teacher Knowledge Distillation

📅 2025-07-28
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Convolutional neural networks (CNNs) are highly vulnerable to adversarial attacks, and achieving both high robustness and clean-data accuracy remains challenging. Method: This paper proposes a novel student-training paradigm that requires no adversarial examples—namely, an adaptive robustness transfer framework based on multi-teacher knowledge distillation. Heterogeneous teacher models are constructed via adversarial training, and a dynamic weighting mechanism—guided by per-teacher prediction accuracy—adaptively fuses their soft-label knowledge to supervise the student model exclusively on clean data. Contribution/Results: Experiments on MNIST-Digits and Fashion-MNIST demonstrate that the student model achieves substantial robustness gains against FGSM, PGD, and CW attacks (average robust accuracy improvement of 12.3%), while maintaining exceptional clean-data accuracy (>99%). To our knowledge, this is the first method to enable efficient, multi-attack robustness transfer solely from clean data.
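The dynamic weighting mechanism described above, which scales each teacher's contribution by its prediction accuracy, can be sketched as follows. The specific rule shown (a softmax over per-teacher accuracies) and the `temperature` parameter are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def teacher_weights(accuracies, temperature=1.0):
    """Map per-teacher prediction accuracies to fusion weights.

    Uses a softmax so that more accurate teachers contribute more
    to the fused soft labels. The softmax form and `temperature`
    are assumptions for illustration.
    """
    a = np.asarray(accuracies, dtype=np.float64) / temperature
    e = np.exp(a - a.max())          # subtract max for numerical stability
    return e / e.sum()

# Example: three adversarially trained teachers with different
# validation accuracies; the most accurate one gets the largest weight.
w = teacher_weights([0.91, 0.85, 0.88])
```

A lower `temperature` sharpens the weighting toward the best teacher; a higher one flattens it toward a uniform average.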

📝 Abstract
Convolutional neural networks (CNNs) excel in computer vision but are susceptible to adversarial attacks: carefully crafted perturbations designed to mislead predictions. Despite advances in adversarial training, a gap persists between model accuracy and robustness. To mitigate this issue, we present a multi-teacher adversarial robustness distillation framework with an adaptive learning strategy. Specifically, our method first trains multiple clones of a baseline CNN model with an adversarial training strategy on a pool of perturbed data produced by different adversarial attacks. Once trained, these adversarially trained models serve as teachers that supervise the learning of a student model on clean data via multi-teacher knowledge distillation. To ensure effective robustness distillation, we design an adaptive learning strategy that controls each teacher's knowledge contribution by assigning weights according to its prediction accuracy. Distilling knowledge from adversarially pre-trained teachers not only enhances the student's learning capabilities but also equips it to withstand different adversarial attacks, despite having no exposure to adversarial data. To verify these claims, we extensively evaluate the proposed method on the MNIST-Digits and Fashion-MNIST datasets under diverse experimental settings. The results demonstrate the efficacy of our multi-teacher adversarial distillation and adaptive learning strategy in enhancing CNNs' adversarial robustness against various adversarial attacks.
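The student objective the abstract describes, fusing the teachers' softened predictions into one soft-label target and combining it with standard cross-entropy on clean data, might look roughly like the sketch below. The temperature `T`, the mixing weight `alpha`, and the exact form of the loss are illustrative assumptions, not the paper's stated formulation; the teacher weights are taken as given from the adaptive strategy.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = np.asarray(z, dtype=np.float64) / T
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def fused_targets(teacher_logits, weights, T=4.0):
    """Weighted average of the teachers' softened predictions.

    `weights` come from the adaptive, accuracy-based strategy and are
    assumed to sum to 1, so each fused row is a valid distribution.
    """
    probs = [w * softmax(l, T) for w, l in zip(weights, teacher_logits)]
    return np.sum(probs, axis=0)

def kd_loss(student_logits, teacher_logits, weights, labels, T=4.0, alpha=0.5):
    """Distillation loss on clean data: a cross-entropy term against the
    fused teacher distribution plus a hard-label cross-entropy term.
    `alpha` and `T` are illustrative hyperparameters.
    """
    target = fused_targets(teacher_logits, weights, T)
    s_soft = softmax(student_logits, T)
    # soft-label term, scaled by T^2 as is conventional in distillation
    kd = -np.mean(np.sum(target * np.log(s_soft + 1e-12), axis=-1)) * T * T
    s_hard = softmax(student_logits)
    ce = -np.mean(np.log(s_hard[np.arange(len(labels)), labels] + 1e-12))
    return alpha * kd + (1.0 - alpha) * ce
```

Because the student only ever sees clean inputs, robustness enters solely through the fused soft labels produced by the adversarially trained teachers.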
Problem

Research questions and friction points this paper is trying to address.

Enhancing CNN robustness against adversarial attacks
Bridging accuracy-robustness gap in adversarial training
Distilling multi-teacher knowledge via adaptive learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-teacher adversarial robustness distillation
Adaptive learning strategy for weight assignment
Knowledge transfer from adversarially trained models
Hayat Ullah
PhD Candidate @ Florida Atlantic University
Image Processing · Pattern Recognition · Machine Learning · Deep Learning · Computer Vision
Syed Muhammad Talha Zaidi
Intelligent Systems, Computer Architecture, Analytics, and Security Laboratory (ISCAAS Lab), Department of Computer Science, Kansas State University, Manhattan, KS 66506, USA
Arslan Munir
Florida Atlantic University
Computer Architecture · Computer Security · Parallel Computing · AI · Computer Vision