Does Order Matter: Connecting the Law of Robustness to Robust Generalization

📅 2026-02-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the theoretical connection between the law of robustness and robust generalization, addressing whether overparameterization is necessary for robust interpolation and whether small robust training loss guarantees small robust test loss. By introducing a nontrivial definition of robust generalization error, the problem is reframed as analyzing lower bounds on the expected Rademacher complexity of the induced robust loss class. Leveraging the theory of Lipschitz functions, the study establishes the first theoretical bridge between the two concepts. The main contributions are a proof that, for any data distribution, requiring robust generalization does not change the order of the Lipschitz constant needed for smooth interpolation, and a characterization of the relationship between the perturbation radius and the Lipschitz scale. The analysis recovers the Ω(n^{1/d}) lower bound of Wu et al. (2023), and experiments on MNIST confirm that the scaling of the Lipschitz constant matches the theoretical predictions.
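To make the scaling claim concrete, here is a minimal, hypothetical Python sketch (not the authors' code): it fits the exponent of a measured Lipschitz lower bound against dataset size in log-log space, where the `L_measured` values are synthetic placeholders standing in for the empirical estimates described above.

```python
import numpy as np

# Hypothetical sketch (not the authors' code): check whether a measured
# lower bound on the Lipschitz constant, L(n), scales like n^(1/d) as in
# Wu et al. (2023), by fitting the exponent in log-log space.
d = 784  # input dimension for flattened MNIST
ns = np.array([1_000, 2_000, 5_000, 10_000, 20_000])  # training-set sizes

# Synthetic placeholders: in a real experiment these would be empirical
# estimates of the smallest Lipschitz constant of a robust interpolator.
L_measured = 1.3 * ns ** (1.0 / d)

# Fit log L = a * log n + b; the slope a estimates the scaling exponent.
slope, intercept = np.polyfit(np.log(ns), np.log(L_measured), 1)
print(f"fitted exponent: {slope:.5f}  vs. predicted 1/d = {1.0 / d:.5f}")
```

With real measurements, a fitted exponent near 1/d would support the Wu et al. (2023) regime, whereas a fit that instead varies with model capacity would point toward the prediction of Bubeck and Sellke (2021).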

📝 Abstract
Bubeck and Sellke (2021) pose as an open problem the connection between the law of robustness and robust generalization. The law of robustness states that overparameterization is necessary for models to interpolate robustly; in particular, robust interpolation requires the learned function to be Lipschitz. Robust generalization asks whether small robust training loss implies small robust test loss. We resolve this problem by explicitly connecting the two for arbitrary data distributions. Specifically, we introduce a nontrivial notion of robust generalization error and convert it into a lower bound on the expected Rademacher complexity of the induced robust loss class. Our bounds recover the $\Omega(n^{1/d})$ regime of Wu et al. (2023) and show that, up to constants, robust generalization does not change the order of the Lipschitz constant required for smooth interpolation. We conduct experiments to probe the predicted scaling with dataset size and model capacity, testing whether empirical behavior aligns more closely with the predictions of Bubeck and Sellke (2021) or Wu et al. (2023). For MNIST, we find that the lower-bound Lipschitz constant scales on the order predicted by Wu et al. (2023). Informally, to obtain low robust generalization error, the Lipschitz constant must lie in a range that we bound, and the allowable perturbation radius is linked to the Lipschitz scale.
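For reference, one standard way to formalize the objects the abstract refers to is sketched below; the paper's own ("nontrivial") definitions may differ, so treat the perturbation set and loss here as assumptions rather than the authors' exact construction.

```latex
% One plausible formalization (an assumption, not verbatim from the paper):
% robust loss as a sup over an l2 ball of radius epsilon, and robust
% generalization error as the gap between robust test and training loss.
\[
  \ell^{\mathrm{rob}}_{\varepsilon}(f; x, y)
    = \sup_{\|\delta\|_{2} \le \varepsilon} \ell\bigl(f(x+\delta),\, y\bigr),
\qquad
  \mathrm{gen}^{\mathrm{rob}}_{\varepsilon}(f)
    = \mathbb{E}_{(x,y)\sim\mathcal{D}}\!\left[\ell^{\mathrm{rob}}_{\varepsilon}(f; x, y)\right]
      - \frac{1}{n}\sum_{i=1}^{n} \ell^{\mathrm{rob}}_{\varepsilon}(f; x_i, y_i).
\]
```

Under this reading, standard symmetrization bounds the gap by the expected Rademacher complexity of the induced robust loss class over $L$-Lipschitz functions, so a lower bound on that complexity becomes a constraint on how small $L$ can be while keeping the robust generalization error low.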
Problem

Research questions and friction points this paper is trying to address.

law of robustness
robust generalization
Lipschitz constant
overparameterization
Rademacher complexity
Innovation

Methods, ideas, or system contributions that make the work stand out.

robust generalization
law of robustness
Lipschitz constant
Rademacher complexity
overparameterization
👥 Authors
Himadri Mandal
Indian Statistical Institute, Kolkata
Vishnu Varadarajan
Ashoka University, Sonepat, Haryana
Jaee Ponde
Ashoka University, Sonepat, Haryana
Aritra Das
University of Maryland, College Park
Machine learning · Condensed matter theory · Lattice gauge theories
Mihir More
Ashoka University, Sonepat, Haryana
Debayan Gupta
Massachusetts Institute of Technology
Cryptography · Secure Multi-Party Computation · Privacy · Databases