Does Order Matter: Connecting the Law of Robustness to Robust Generalization

📅 2026-02-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the theoretical connection between the law of robustness and robust generalization, addressing whether overparameterization is necessary for robust interpolation and whether small robust training loss guarantees small robust test loss. By introducing a nontrivial definition of robust generalization error, the problem is reframed as analyzing lower bounds on the expected Rademacher complexity of the induced robust loss class. Leveraging the theory of Lipschitz functions, the study establishes the first theoretical bridge between the two concepts. The main contributions are a proof that, for any data distribution, requiring robust generalization does not change the order of the Lipschitz constant needed for smooth interpolation, and a characterization of the relationship between the perturbation radius and the Lipschitz scale. The analysis recovers the Ω(n^{1/d}) lower bound of Wu et al. (2023), and experiments on MNIST confirm that the scaling of the Lipschitz constant matches the theoretical predictions.
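To make the scaling claim concrete, here is a minimal, hypothetical Python sketch (not the authors' code): it fits the exponent of a measured Lipschitz lower bound against dataset size in log-log space, where the `L_measured` values are synthetic placeholders standing in for the empirical estimates described above.

```python
import numpy as np

# Hypothetical sketch (not the authors' code): check whether a measured
# lower bound on the Lipschitz constant, L(n), scales like n^(1/d) as in
# Wu et al. (2023), by fitting the exponent in log-log space.
d = 784  # input dimension for flattened MNIST
ns = np.array([1_000, 2_000, 5_000, 10_000, 20_000])  # training-set sizes

# Synthetic placeholders: in a real experiment these would be empirical
# estimates of the smallest Lipschitz constant of a robust interpolator.
L_measured = 1.3 * ns ** (1.0 / d)

# Fit log L = a * log n + b; the slope a estimates the scaling exponent.
slope, intercept = np.polyfit(np.log(ns), np.log(L_measured), 1)
print(f"fitted exponent: {slope:.5f}  vs. predicted 1/d = {1.0 / d:.5f}")
```

With real measurements, a fitted exponent near 1/d would support the Wu et al. (2023) regime, whereas a fit that instead varies with model capacity would point toward the prediction of Bubeck and Sellke (2021).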

📝 Abstract
Bubeck and Sellke (2021) pose as an open problem the connection between the law of robustness and robust generalization. The law of robustness states that overparameterization is necessary for models to interpolate robustly; in particular, robust interpolation requires the learned function to be Lipschitz. Robust generalization asks whether small robust training loss implies small robust test loss. We resolve this problem by explicitly connecting the two for arbitrary data distributions. Specifically, we introduce a nontrivial notion of robust generalization error and convert it into a lower bound on the expected Rademacher complexity of the induced robust loss class. Our bounds recover the $\Omega(n^{1/d})$ regime of Wu et al. (2023) and show that, up to constants, robust generalization does not change the order of the Lipschitz constant required for smooth interpolation. We conduct experiments to probe the predicted scaling with dataset size and model capacity, testing whether empirical behavior aligns more closely with the predictions of Bubeck and Sellke (2021) or Wu et al. (2023). For MNIST, we find that the lower-bound Lipschitz constant scales on the order predicted by Wu et al. (2023). Informally, to obtain low robust generalization error, the Lipschitz constant must lie in a range that we bound, and the allowable perturbation radius is linked to the Lipschitz scale.
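For reference, one standard way to formalize the objects the abstract refers to is sketched below; the paper's own ("nontrivial") definitions may differ, so treat the perturbation set and loss here as assumptions rather than the authors' exact construction.

```latex
% One plausible formalization (an assumption, not verbatim from the paper):
% robust loss as a sup over an l2 ball of radius epsilon, and robust
% generalization error as the gap between robust test and training loss.
\[
  \ell^{\mathrm{rob}}_{\varepsilon}(f; x, y)
    = \sup_{\|\delta\|_{2} \le \varepsilon} \ell\bigl(f(x+\delta),\, y\bigr),
\qquad
  \mathrm{gen}^{\mathrm{rob}}_{\varepsilon}(f)
    = \mathbb{E}_{(x,y)\sim\mathcal{D}}\!\left[\ell^{\mathrm{rob}}_{\varepsilon}(f; x, y)\right]
      - \frac{1}{n}\sum_{i=1}^{n} \ell^{\mathrm{rob}}_{\varepsilon}(f; x_i, y_i).
\]
```

Under this reading, standard symmetrization bounds the gap by the expected Rademacher complexity of the induced robust loss class over $L$-Lipschitz functions, so a lower bound on that complexity becomes a constraint on how small $L$ can be while keeping the robust generalization error low.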
Problem

Research questions and friction points this paper is trying to address.

law of robustness
robust generalization
Lipschitz constant
overparameterization
Rademacher complexity
Innovation

Methods, ideas, or system contributions that make the work stand out.

robust generalization
law of robustness
Lipschitz constant
Rademacher complexity
overparameterization
👥 Authors
Himadri Mandal
Indian Statistical Institute, Kolkata
Vishnu Varadarajan
Ashoka University, Sonepat, Haryana
Jaee Ponde
Ashoka University, Sonepat, Haryana
Aritra Das
University of Maryland, College Park
Machine learning · Condensed matter theory · Lattice gauge theories
Mihir More
Ashoka University, Sonepat, Haryana
Debayan Gupta
Massachusetts Institute of Technology
Cryptography · Secure Multi-Party Computation · Privacy · Databases