🤖 AI Summary
Existing robustness-based generalization error bounds are often excessively loose or even vacuous in practice, failing to reflect the true generalization capability of deep models. This work proposes a locally adaptive approach to constructing generalization bounds by introducing notions of local robustness and stability. The input space is partitioned into subregions, and the robustness term is scaled according to the proportion of stable versus unstable samples within each region, yielding a tighter, data- and model-dependent upper bound. This method is the first to refine global robustness measures into localized forms, substantially mitigating the vacuity commonly observed in traditional bounds. Experiments on ImageNet demonstrate that the proposed bound remains non-vacuous across various robust deep networks and closely tracks their empirical generalization errors, significantly outperforming existing approaches.
📝 Abstract
Generalization is a critical property of data-driven models, particularly deep learning models deployed in safety-critical applications. Robustness-based generalization bounds have gained attention as a principled way to link robustness properties to generalization performance, often in a data-dependent manner. However, most existing bounds suffer from vacuousness in practical settings, yielding loose upper bounds that greatly exceed the actual error rates and limiting their usefulness for real-world evaluation. While this issue is often attributed to the uncertainty term, a substantial part of the problem originates from the robustness term itself, particularly for the 0-1 loss. Existing approaches typically treat the robustness term as a global measure, ignoring its variation across different sub-regions of the input space. In this work, we propose a generalization bound that addresses this limitation by scaling the robustness term according to the number of stable and unstable samples within each sub-region. Our bounds incorporate both data- and model-dependent factors while maintaining practical relevance (yielding tighter upper bounds on true error). Experiments on models trained on the ImageNet dataset show that our bounds remain consistently non-vacuous and achieve the tightest estimates among existing methods, closely aligning with empirical performance across a range of robust deep neural networks.