Long-tailed Adversarial Training with Self-Distillation

📅 2025-03-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Adversarial training suffers significantly degraded robustness on tail classes under long-tailed distributions, primarily because tail-class samples are scarce. To address this, the authors propose a balanced self-teacher self-distillation framework designed for long-tailed scenarios. The method is claimed to be the first to integrate balanced sampling with self-distillation to construct a robust self-teacher model, while jointly incorporating PGD-based adversarial training and a classification loss tailored to long-tailed distributions. Extensive experiments on CIFAR-10, CIFAR-100, and Tiny-ImageNet show tail-class PGD robust accuracy improving by 20.3, 7.1, and 3.8 percentage points, respectively, establishing a new state of the art in long-tailed adversarial robustness.
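The PGD-based adversarial training mentioned above generates training inputs by repeatedly stepping along the sign of the input gradient and projecting back into an ε-ball around the clean input. A minimal, self-contained sketch of a PGD attack on a toy logistic model (the model, function names, and default hyperparameters are illustrative assumptions, not the paper's implementation):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def pgd_attack(x, y, w, eps=0.3, alpha=0.1, steps=10):
    """Untargeted L-infinity PGD on the logistic model f(x) = sigmoid(w . x).

    Ascends the cross-entropy loss in input space, projecting each iterate
    back into the eps-ball around the clean input x.
    """
    x_adv = list(x)
    for _ in range(steps):
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x_adv)))
        # d(cross-entropy)/dx_i = (p - y) * w_i for the logistic model
        grad = [(p - y) * wi for wi in w]
        # signed gradient-ascent step
        x_adv = [xi + alpha * (1 if g > 0 else -1 if g < 0 else 0)
                 for xi, g in zip(x_adv, grad)]
        # project back into the eps-ball around the clean input
        x_adv = [min(max(xa, xc - eps), xc + eps)
                 for xa, xc in zip(x_adv, x)]
    return x_adv
```

In adversarial training proper, each minibatch is replaced (or augmented) by such adversarial examples before the usual gradient update on the model parameters.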

📝 Abstract
Adversarial training significantly enhances adversarial robustness, yet superior performance is predominantly achieved on balanced datasets. Achieving adversarial robustness under imbalanced or long-tailed distributions is considerably more challenging, mainly due to the scarcity of tail-class instances. Previous research on adversarial robustness under long-tailed distributions has primarily combined traditional long-tailed natural training with existing adversarial robustness methods. In this study, we provide an in-depth analysis of why adversarial training struggles to achieve high performance on tail classes in long-tailed distributions. Furthermore, we propose a simple yet effective solution that advances adversarial robustness on long-tailed distributions through a novel self-distillation technique. Specifically, this approach leverages a balanced self-teacher model, which is trained on a balanced dataset sampled from the original long-tailed dataset. Our extensive experiments demonstrate state-of-the-art performance in both clean and robust accuracy for long-tailed adversarial robustness, with significant improvements in tail-class performance across various datasets. We improve accuracy against PGD attacks on tail classes by 20.3, 7.1, and 3.8 percentage points on CIFAR-10, CIFAR-100, and Tiny-ImageNet, respectively, while achieving the highest robust accuracy.
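The balanced self-teacher is trained on a balanced dataset sampled from the original long-tailed data. One simple way to realize such sampling is to subsample every class down to the tail-class count; a hedged sketch (the helper name and uniform-subsampling strategy are assumptions, since the paper's exact sampling procedure is not reproduced here):

```python
import random
from collections import defaultdict

def balanced_sample(dataset, seed=0):
    """Draw a class-balanced subsample from a long-tailed dataset.

    dataset: list of (example, label) pairs.
    Keeps min-class-count examples per class, so every class is
    equally represented in the returned subset.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for ex, label in dataset:
        by_class[label].append((ex, label))
    n = min(len(items) for items in by_class.values())  # tail-class count
    balanced = []
    for items in by_class.values():
        balanced.extend(rng.sample(items, n))
    rng.shuffle(balanced)
    return balanced
```

Subsampling discards head-class data, so in practice class-balanced resampling with replacement (oversampling tail classes per epoch) is an equally plausible reading of "balanced dataset sampled from the original long-tailed dataset."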
Problem

Research questions and friction points this paper is trying to address.

Enhance adversarial robustness on long-tailed datasets
Address the performance drop on tail classes during adversarial training
Propose a self-distillation technique for balanced adversarial training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-distillation technique enhances adversarial robustness
Balanced self-teacher model improves tail-class performance
State-of-the-art accuracy against PGD attacks
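Self-distillation typically matches the student's softened output distribution to the teacher's via a KL-divergence term. A minimal sketch of such a distillation loss (the temperature value and function names are illustrative assumptions; the paper's exact loss formulation is not reproduced here):

```python
import math

def softmax(logits, t=1.0):
    """Temperature-softened softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp((z - m) / t) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distill_loss(student_logits, teacher_logits, t=2.0):
    """KL(teacher || student) over temperature-t softened distributions.

    The t*t factor is the standard rescaling that keeps gradient
    magnitudes comparable across temperatures.
    """
    p = softmax(teacher_logits, t)  # balanced self-teacher soft targets
    q = softmax(student_logits, t)  # student predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q)) * t * t
```

In a self-distillation setup the "teacher" shares the student's architecture (here, a copy trained on the balanced subsample), and this term is added to the adversarial classification loss during training.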