SORA: Free Second-Order Attacks in Fast Adversarial Training

📅 2026-05-30

🤖 AI Summary
This work addresses the severe drop in robustness against multi-step attacks caused by catastrophic overfitting in single-step fast adversarial training. To mitigate ε-overfitting, the authors propose an adaptive adversarial training method that introduces perturbation variability and dynamically adjusts the perturbation step size based on the geometry of the loss landscape. Key contributions include a novel perspective on ε-overfitting, a lightweight perturbation alignment metric termed PertAlign, and SORA, a strategy for leveraging second-order information without additional computational overhead. Evaluated across diverse datasets and model architectures with a single fixed set of hyperparameters, the method prevents catastrophic overfitting and matches or surpasses the robustness of existing fast adversarial training approaches while delivering higher clean accuracy.
📝 Abstract
Adversarial Training (AT) is a leading defense against adversarial examples but often suffers from Catastrophic Overfitting (CO) in efficient single-step variants, where robustness to multi-step attacks collapses despite high single-step performance. We address this failure mode with two contributions. First, we formalize Epsilon Overfitting (EO), a perspective in which fixed perturbation magnitudes and directions exacerbate CO, and show that introducing perturbation variability significantly improves robust generalization across different architectures and datasets. Second, we propose PertAlign (Perturbation Alignment), a theoretically grounded, computationally negligible metric that predicts CO onset by measuring gradient alignment across attack stages. Leveraging these insights, we introduce SORA, an adaptive step-size AT method that dynamically adjusts perturbations based on loss surface geometry. SORA consistently prevents CO, achieves state-of-the-art robustness and clean accuracy, and generalizes across datasets and architectures using a single fixed set of hyperparameters, which is essential for applicability in fast AT. Extensive experiments on diverse datasets and architectures show that SORA matches or surpasses the robustness of prior methods while delivering higher clean accuracy and superior efficiency. Code is available at https://github.com/SecondOrderAT/SORA.
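To make the idea of measuring gradient alignment and adapting the step size more concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not give the PertAlign formula or the SORA update rule, so the cosine-similarity measure, the `adaptive_step` rule, the `min_scale` parameter, and all function names below are illustrative assumptions. The gradient values are made up.

```python
import math

def cosine_alignment(g_a, g_b):
    """Cosine similarity between two flattened gradient vectors.

    Assumption: alignment is measured as a cosine; the paper's PertAlign
    definition may differ.
    """
    dot = sum(a * b for a, b in zip(g_a, g_b))
    norm_a = math.sqrt(sum(a * a for a in g_a))
    norm_b = math.sqrt(sum(b * b for b in g_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero gradient has no direction to compare
    return dot / (norm_a * norm_b)

def adaptive_step(eps, alignment, min_scale=0.25):
    """Hypothetical rule: shrink the step when the two gradients disagree."""
    scale = max(min_scale, min(1.0, alignment))
    return eps * scale

def fgsm_delta(grad, step):
    """Single-step sign perturbation with magnitude `step` per coordinate."""
    return [step * ((g > 0) - (g < 0)) for g in grad]

# Made-up input gradients at two attack stages (e.g. clean and perturbed input).
grad_stage_0 = [0.5, -1.0, 0.25]
grad_stage_1 = [0.4, -0.9, -0.30]

alignment = cosine_alignment(grad_stage_0, grad_stage_1)
step = adaptive_step(8 / 255, alignment)
delta = fgsm_delta(grad_stage_0, step)
```

In this sketch, a low alignment between the two gradients is treated as a sign that the loss surface is strongly curved near the input, so the perturbation step is reduced rather than kept at the full budget.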
Problem

Research questions and friction points this paper is trying to address.

Catastrophic Overfitting
Adversarial Training
Robustness
Single-step Attacks
Multi-step Attacks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Catastrophic Overfitting
Epsilon Overfitting
Perturbation Alignment
Adaptive Step-size
Fast Adversarial Training