🤖 AI Summary
This paper investigates the fundamental causes of adversarial vulnerability in high-dimensional linear binary classification, distinguishing statistical generalization error—arising from finite-sample limitations—from genuine adversarial misclassification induced by label-preserving perturbations.
Method: We introduce a novel error metric and, leveraging tools from high-dimensional statistics and random matrix theory, derive an exact asymptotic characterization of when consistent adversarial attacks exist, together with closed-form asymptotic expressions for the new error metric under both well-specified and latent-space models.
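One plausible formalization of this metric (our notation; the $\ell_2$ budget $\varepsilon$ and the noiseless well-specified setting $y = \operatorname{sign}\langle \theta^\star, x\rangle$ are illustrative assumptions, not necessarily the paper's exact definitions): for a learned linear classifier $\hat\theta$, the consistent adversarial error counts only perturbations that flip the prediction while leaving the ground-truth label intact,

$$
\mathrm{Err}_{\mathrm{cons}}(\hat\theta) \;=\; \mathbb{P}_{(x,y)}\!\left[\exists\, \delta,\ \|\delta\|_2 \le \varepsilon:\ \operatorname{sign}\langle \theta^\star, x+\delta\rangle = y \ \ \text{and}\ \ \operatorname{sign}\langle \hat\theta, x+\delta\rangle \neq y \right],
$$

in contrast to the standard robust error, which counts any $\varepsilon$-perturbation that flips the prediction, including perturbations that also change the ground-truth label.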
Contribution/Results: Contrary to intuition, we establish that overparameterization systematically *exacerbates*, rather than alleviates, vulnerability to label-preserving perturbations. Our analysis places adversarial robustness on a rigorous statistical footing, clarifying the relationship between model expressivity, sample complexity, and adversarial susceptibility in high dimensions.
📝 Abstract
What fundamentally distinguishes an adversarial attack from a misclassification due to limited model expressivity or finite data? In this work, we investigate this question in the setting of high-dimensional binary classification, where statistical effects due to limited data availability play a central role. We introduce a new error metric that precisely captures this distinction, quantifying model vulnerability to consistent adversarial attacks -- perturbations that preserve the ground-truth labels. Our main technical contribution is an exact and rigorous asymptotic characterization of this metric in both well-specified models and latent-space models, revealing vulnerability patterns that differ from those of standard robust error measures. The theoretical results demonstrate that as models become more overparameterized, their vulnerability to label-preserving perturbations grows, offering theoretical insight into the mechanisms underlying model sensitivity to adversarial attacks.
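A minimal simulation sketch of this effect, under illustrative assumptions that are ours rather than the paper's (a noiseless well-specified model $y = \operatorname{sign}\langle\theta^\star, x\rangle$, Gaussian inputs, an $\ell_2$ budget, and a minimum-norm least-squares fit as the learned classifier): perturbing a test point along the component of $\hat\theta$ orthogonal to $\theta^\star$ leaves the ground-truth label unchanged, so any prediction flip it causes is, by construction, a consistent adversarial attack. The fraction of points flipped this way gives an empirical lower bound on the consistent adversarial error.

```python
# Toy sketch (not the paper's code or exact setup); numpy only.
import numpy as np

rng = np.random.default_rng(0)

def consistent_attack_rate(n, d, eps, n_test=5000):
    """Fraction of test points that are classified correctly yet flipped by a
    label-preserving perturbation of l2 norm eps (a lower bound on the
    consistent adversarial error under the assumptions above)."""
    theta_star = np.zeros(d)
    theta_star[0] = 1.0                      # unit-norm ground-truth direction

    # Training data and minimum-l2-norm least-squares fit on the +/-1 labels.
    X = rng.standard_normal((n, d))
    y = np.sign(X @ theta_star)
    theta_hat = np.linalg.pinv(X) @ y

    # Fresh test data.
    X_test = rng.standard_normal((n_test, d))
    y_test = np.sign(X_test @ theta_star)

    # Attack direction: component of theta_hat orthogonal to theta_star.
    # Moving along it leaves <theta_star, x>, hence the true label, unchanged.
    v = theta_hat - (theta_hat @ theta_star) * theta_star
    v_norm = np.linalg.norm(v)

    margins = y_test * (X_test @ theta_hat)
    correct = margins > 0
    # The learned prediction flips iff the margin is smaller than eps * ||v||.
    flipped = correct & (margins < eps * v_norm)
    return flipped.mean()

# Fixed sample size, growing ambient dimension (more overparameterization).
for d in [50, 200, 800, 3200]:
    rate = consistent_attack_rate(n=40, d=d, eps=1.0)
    print(f"d={d:5d}  consistent-attack success rate ~ {rate:.3f}")
```

Sweeping the dimension d with the sample size n fixed probes how this attack's success rate behaves as overparameterization grows; note the construction only certifies a lower bound, since other label-preserving perturbations may also flip the prediction.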