Indirect Gradient Matching for Adversarial Robust Distillation

📅 2023-12-06
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
Small models exhibit significantly weaker adversarial robustness than large models. To address this, the paper treats the teacher model's input gradients as transferable knowledge for adversarial distillation and matches them indirectly, avoiding the cost and instability of computing input gradients directly. The proposed Indirect Gradient Distillation Module (IGDM) integrates with existing adversarial distillation methods without requiring additional data augmentation. On CIFAR-100, it substantially improves the robustness of small models: under AutoAttack, ResNet-18 and MobileNetV2 gain +2.26% and +3.14% accuracy, respectively. The core contribution lies in empirically demonstrating the cross-model transferability of input gradients and establishing a stable, efficient paradigm for gradient-based knowledge transfer.
📝 Abstract
Adversarial training significantly improves adversarial robustness, but superior performance is primarily attained with large models. This substantial performance gap for smaller models has spurred active research into adversarial distillation (AD) to mitigate the difference. Existing AD methods leverage the teacher's logits as a guide. In contrast to these approaches, we aim to transfer another piece of knowledge from the teacher, the input gradient. In this paper, we propose a distillation module termed Indirect Gradient Distillation Module (IGDM) that indirectly matches the student's input gradient with that of the teacher. Experimental results show that IGDM seamlessly integrates with existing AD methods, significantly enhancing their performance. Particularly, utilizing IGDM on the CIFAR-100 dataset improves the AutoAttack accuracy from 28.06% to 30.32% with the ResNet-18 architecture and from 26.18% to 29.32% with the MobileNetV2 architecture when integrated into the SOTA method without additional data augmentation.
Problem

Research questions and friction points this paper is trying to address.

Smaller models lag well behind large models in adversarial robustness
Existing adversarial distillation methods rely only on the teacher's logits as a guide
How to transfer the teacher's input-gradient knowledge to the student stably and efficiently
Innovation

Methods, ideas, or system contributions that make the work stand out.

Indirect Gradient Distillation Module (IGDM)
Indirectly matches the student's input gradient with the teacher's
Enhances adversarial robustness without additional data augmentation
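The "indirect" part of the matching can be understood through a first-order Taylor expansion: for a small perturbation δ, f(x + δ) − f(x) ≈ J(x)·δ, so aligning the teacher's and student's *output differences* aligns their input gradients without ever computing those gradients explicitly. Below is a minimal NumPy sketch of this idea on toy linear models; all names (`W_t`, `W_s`, `igdm_loss`) are illustrative and do not come from the paper's code, which uses deep networks and adversarial perturbations rather than random ones.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "teacher" and "student": linear maps, so their input Jacobians
# are simply W_t and W_s.
W_t = rng.normal(size=(3, 5))
W_s = W_t + 0.01 * rng.normal(size=(3, 5))  # student close to teacher

def f(W, x):
    return W @ x

x = rng.normal(size=5)            # a clean input
delta = 1e-3 * rng.normal(size=5)  # small perturbation (e.g. an adversarial step)

# Indirect gradient matching: compare output *differences* instead of
# explicit input gradients. By first-order Taylor expansion,
#   f(x + delta) - f(x) ≈ J(x) @ delta,
# so matching the differences matches Jacobian-vector products.
diff_t = f(W_t, x + delta) - f(W_t, x)
diff_s = f(W_s, x + delta) - f(W_s, x)
igdm_loss = float(np.sum((diff_s - diff_t) ** 2))

# For linear maps the Taylor approximation is exact.
assert np.allclose(diff_t, W_t @ delta)
print(f"gradient-matching loss on output differences: {igdm_loss:.3e}")
```

In training, a loss of this form would be added to an existing adversarial distillation objective and minimized with respect to the student's parameters, nudging the student's local input geometry toward the teacher's.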