🤖 AI Summary
This work investigates the relationship between generalization and min-max optimization in Free Adversarial Training (FreeAT). Adversarially trained networks show a much larger generalization gap than networks trained with standard empirical risk minimization. Within the algorithmic stability framework, we prove generalization error bounds for vanilla adversarial training, which fully optimizes the perturbations at every iteration, and for FreeAT, which updates perturbations and model parameters simultaneously. The bounds indicate that FreeAT may have a smaller train-test gap, and that its simultaneous min-max updates are the reason. Empirical evaluation reports that, over a comparable number of training iterations, FreeAT reduces the generalization gap by 15–22% across the benchmarks tested. Our implementation is publicly available.
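For reference, the two quantities the summary refers to can be written out as follows. The notation is assumed here for illustration and may differ from the paper's: a classifier $h_w$ with weights $w$, a loss $\ell$, a perturbation budget $\epsilon$, per-sample perturbations $\delta_i$, and a data distribution $P$. Adversarial training solves the min-max problem

```latex
\min_{w} \; \frac{1}{n} \sum_{i=1}^{n} \; \max_{\|\delta_i\| \le \epsilon} \; \ell\bigl(h_w(x_i + \delta_i),\, y_i\bigr)
```

and the generalization gap is the difference between the adversarial risk on the population and on the $n$ training samples:

```latex
\mathcal{E}_{\mathrm{gen}}(w)
  = \mathbb{E}_{(x,y) \sim P}\Bigl[\, \max_{\|\delta\| \le \epsilon} \ell\bigl(h_w(x + \delta),\, y\bigr) \Bigr]
  - \frac{1}{n} \sum_{i=1}^{n} \max_{\|\delta_i\| \le \epsilon} \ell\bigl(h_w(x_i + \delta_i),\, y_i\bigr)
```

The bounds discussed above are upper bounds on a quantity of this kind.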
📝 Abstract
While adversarial training methods have significantly improved the robustness of deep neural networks against norm-bounded adversarial perturbations, the generalization gap between their performance on training and test data is considerably greater than that of standard empirical risk minimization. Recent studies have aimed to connect the generalization properties of adversarially trained classifiers to the min-max optimization algorithm used in their training. In this work, we analyze the interconnections between generalization and optimization in adversarial training using the algorithmic stability framework. Specifically, our goal is to compare the generalization gap of neural networks trained using the vanilla adversarial training method, which fully optimizes perturbations at every iteration, with the free adversarial training method, which simultaneously optimizes norm-bounded perturbations and classifier parameters. We prove bounds on the generalization error of these methods, indicating that the free adversarial training method may exhibit a lower generalization gap between training and test samples due to its simultaneous min-max optimization of classifier weights and perturbation variables. We conduct several numerical experiments to evaluate the train-to-test generalization gap in vanilla and free adversarial training methods. Our empirical findings also suggest that the free adversarial training method could lead to a smaller generalization gap over a similar number of training iterations. The code for the paper is available at https://github.com/Xiwei-Cheng/Stability_FreeAT.
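The difference between the two update schemes can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see the repository above for that). It assumes a linear logistic-regression classifier, an L-infinity perturbation budget `eps`, and signed-gradient ascent on the perturbations. The function names (`loss_and_grads`, `vanilla_at_step`, `free_at_step`) and the step sizes are hypothetical, and the "full" inner maximization of vanilla adversarial training is approximated by `k_steps` projected ascent steps.

```python
import numpy as np

def loss_and_grads(w, X, y, delta):
    """Logistic loss on perturbed inputs X + delta, with labels y in {-1, +1}.

    Returns the mean loss, its gradient w.r.t. w, and the per-example
    gradient of each example's loss w.r.t. its own perturbation.
    """
    Xp = X + delta
    margins = y * (Xp @ w)
    loss = np.mean(np.logaddexp(0.0, -margins))
    # d/dm log(1 + exp(-m)) = -sigmoid(-m); tanh form avoids overflow.
    coef = -y * 0.5 * (1.0 - np.tanh(margins / 2.0))
    grad_w = (Xp * coef[:, None]).mean(axis=0)
    grad_delta = coef[:, None] * w[None, :]
    return loss, grad_w, grad_delta

def vanilla_at_step(w, X, y, eps, lr_w, lr_delta, k_steps):
    """One weight update after (approximately) maximizing over delta."""
    delta = np.zeros_like(X)
    for _ in range(k_steps):
        _, _, g_delta = loss_and_grads(w, X, y, delta)
        # Signed-gradient ascent, then projection onto the L-inf ball.
        delta = np.clip(delta + lr_delta * np.sign(g_delta), -eps, eps)
    _, g_w, _ = loss_and_grads(w, X, y, delta)
    return w - lr_w * g_w

def free_at_step(w, delta, X, y, eps, lr_w, lr_delta):
    """One simultaneous update: descent on w and ascent on delta,
    both computed from the same gradient evaluation."""
    _, g_w, g_delta = loss_and_grads(w, X, y, delta)
    w_new = w - lr_w * g_w
    delta_new = np.clip(delta + lr_delta * np.sign(g_delta), -eps, eps)
    return w_new, delta_new

# Toy usage on synthetic, linearly separable data.
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 5))
y = np.where(X @ np.ones(5) >= 0, 1.0, -1.0)

w_vanilla = np.full(5, 0.1)
w_free, delta_free = np.full(5, 0.1), np.zeros_like(X)
for _ in range(20):
    w_vanilla = vanilla_at_step(w_vanilla, X, y, eps=0.1, lr_w=0.5,
                                lr_delta=0.05, k_steps=5)
    w_free, delta_free = free_at_step(w_free, delta_free, X, y, eps=0.1,
                                      lr_w=0.5, lr_delta=0.05)
```

In this sketch the vanilla step spends `k_steps + 1` gradient evaluations per weight update and restarts the perturbation from zero each time, whereas the free step spends one evaluation and carries `delta` over to the next update. That carry-over is the simultaneous optimization of weights and perturbations that the abstract contrasts with the vanilla method.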