AI Summary
This work addresses the severe drop in robustness against multi-step attacks caused by catastrophic overfitting in single-step fast adversarial training. To mitigate ε-overfitting, the authors propose an adaptive adversarial training method that introduces perturbation variability and dynamically adjusts the perturbation step size based on the geometry of the loss landscape. Key contributions include a novel perspective on ε-overfitting, a lightweight perturbation alignment metric termed PertAlign, and SORA, a strategy for leveraging second-order information without additional computational overhead. Evaluated across diverse datasets and model architectures, the method achieves state-of-the-art robustness and clean accuracy using a single hyperparameter setting, significantly outperforming existing fast adversarial training approaches.
Abstract
Adversarial Training (AT) is a leading defense against adversarial examples but often suffers from Catastrophic Overfitting (CO) in efficient single-step variants, where robustness to multi-step attacks collapses despite high single-step performance. We address this failure mode with two contributions. First, we formalize Epsilon Overfitting (EO), a perspective in which fixed perturbation magnitudes and directions exacerbate CO, and show that introducing perturbation variability significantly improves robust generalization across different architectures and datasets. Second, we propose PertAlign (Perturbation Alignment), a theoretically grounded, computationally negligible metric that predicts CO onset by measuring gradient alignment across attack stages. Leveraging these insights, we introduce SORA, an adaptive step-size AT method that dynamically adjusts perturbations based on loss surface geometry. SORA consistently prevents CO, achieves state-of-the-art robustness and clean accuracy, and generalizes across datasets and architectures using a single fixed set of hyperparameters, which is essential for applicability in fast AT. Extensive experiments on diverse datasets and architectures show that SORA matches or surpasses the robustness of prior methods while delivering higher clean accuracy and superior efficiency. Code is available at https://github.com/SecondOrderAT/SORA.
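To make "measuring gradient alignment across attack stages" concrete, here is a minimal sketch in plain Python. It is not the paper's PertAlign definition or the SORA algorithm, neither of which is given in the abstract. It assumes, purely for illustration, that alignment means the cosine similarity between the input gradient at the clean point and the input gradient after one FGSM-style step. The toy model, its weights, and all function names are hypothetical.

```python
import math


def input_gradient(W, v, x, y):
    """Gradient of 0.5 * (f(x) - y)**2 w.r.t. x, for the toy model f(x) = v . tanh(W x)."""
    hidden = [math.tanh(sum(w * x_i for w, x_i in zip(row, x))) for row in W]
    residual = sum(v_j * h_j for v_j, h_j in zip(v, hidden)) - y
    # Chain rule: d f / d x_i = sum_j v_j * (1 - h_j^2) * W[j][i]
    return [
        residual * sum(v[j] * (1.0 - hidden[j] ** 2) * W[j][i] for j in range(len(W)))
        for i in range(len(x))
    ]


def sign(z):
    """Return -1, 0, or 1 according to the sign of z."""
    return (z > 0) - (z < 0)


def single_step_perturb(x, grad, step_size):
    """FGSM-style step: move each coordinate by step_size along the gradient sign."""
    return [x_i + step_size * sign(g_i) for x_i, g_i in zip(x, grad)]


def cosine_alignment(g_a, g_b, tiny=1e-12):
    """Cosine similarity between two gradients (assumed stand-in for an alignment score)."""
    dot = sum(a * b for a, b in zip(g_a, g_b))
    norm_a = math.sqrt(sum(a * a for a in g_a))
    norm_b = math.sqrt(sum(b * b for b in g_b))
    return dot / (norm_a * norm_b + tiny)


# Hypothetical toy weights and a single example; not taken from the paper.
W = [[0.8, -0.5], [0.3, 1.2], [-1.0, 0.4]]
v = [1.0, -0.7, 0.5]
x, y = [0.2, -0.1], 1.0

g_clean = input_gradient(W, v, x, y)
for step_size in (0.01, 0.1, 1.0):
    x_adv = single_step_perturb(x, g_clean, step_size)
    g_adv = input_gradient(W, v, x_adv, y)
    print(step_size, round(cosine_alignment(g_clean, g_adv), 4))
```

A score near 1 means the loss surface looks locally linear along the step, so a single-step attack is a reasonable proxy for a multi-step one. A falling score suggests curvature that a fixed step size ignores, which is the kind of signal an adaptive step-size rule could react to.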