🤖 AI Summary
This work investigates when Gaussian universality breaks down for high-dimensional convex empirical risk minimization (ERM) under non-Gaussian designs. By heuristically extending the Convex Gaussian Min-Max Theorem (CGMT) to non-Gaussian settings and leveraging concentration inequalities, random matrix theory, and asymptotic analysis, the authors derive an asymptotic min-max characterization that approximates the mean and covariance of the ERM estimator and describes the distribution of its predictions on an independent test covariate. A central contribution is clarifying the scope and limits of Gaussian universality, and proving that any twice continuously differentiable regularizer is asymptotically equivalent to a quadratic form determined by its Hessian at zero and its gradient at the estimator's mean. Numerical simulations across diverse loss functions and data models validate the theoretical predictions.
📝 Abstract
We study high-dimensional convex empirical risk minimization (ERM) under general non-Gaussian data designs. By heuristically extending the Convex Gaussian Min-Max Theorem (CGMT) to non-Gaussian settings, we derive an asymptotic min-max characterization of key statistics, enabling approximation of the mean $μ_{\hatθ}$ and covariance $C_{\hatθ}$ of the ERM estimator $\hatθ$. Specifically, under a concentration assumption on the data matrix and standard regularity conditions on the loss and regularizer, we show that for a test covariate $x$ independent of the training data, the projection $\hatθ^\top x$ approximately follows the convolution of the (generally non-Gaussian) distribution of $μ_{\hatθ}^\top x$ with an independent centered Gaussian variable of variance $\text{Tr}(C_{\hatθ}\mathbb{E}[xx^\top])$. This result clarifies the scope and limits of Gaussian universality for ERMs. Additionally, we prove that any $\mathcal{C}^2$ regularizer is asymptotically equivalent to a quadratic form determined solely by its Hessian at zero and gradient at $μ_{\hatθ}$. Numerical simulations across diverse losses and models are provided to validate our theoretical predictions and qualitative insights.
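The predictive-distribution statement above can be illustrated with a minimal NumPy sketch. It is not the authors' code: the function name `sample_projection`, the Rademacher covariates, and the placeholder values for the mean `mu` and covariance `C` are all assumptions chosen for illustration. In practice `mu` and `C` would come from the min-max characterization, and $\mathbb{E}[xx^\top]$ would be the population second moment rather than an empirical estimate.

```python
import numpy as np

def sample_projection(mu, C, X_test, rng):
    """Draw approximate samples of theta_hat^T x: mu^T x plus independent Gaussian noise."""
    # Empirical stand-in for E[x x^T] (assumption: the abstract uses the population moment).
    second_moment = X_test.T @ X_test / X_test.shape[0]
    # Variance of the independent centered Gaussian term: Tr(C E[x x^T]).
    noise_var = np.trace(C @ second_moment)
    # Generally non-Gaussian part: mu^T x for each test covariate.
    mean_part = X_test @ mu
    noise = rng.normal(0.0, np.sqrt(noise_var), size=X_test.shape[0])
    return mean_part + noise, noise_var

rng = np.random.default_rng(0)
d, n_test = 50, 20000
# Hypothetical non-Gaussian design: i.i.d. Rademacher entries scaled by 1/sqrt(d).
X_test = rng.choice([-1.0, 1.0], size=(n_test, d)) / np.sqrt(d)
mu = np.ones(d)        # placeholder for the estimator mean
C = 0.1 * np.eye(d)    # placeholder for the estimator covariance
samples, noise_var = sample_projection(mu, C, X_test, rng)
print(noise_var, samples.var())
```

With these placeholder inputs the Gaussian term has variance $\text{Tr}(C\,\mathbb{E}[xx^\top]) = 0.1$, and the total variance of the samples is the variance of $μ^\top x$ plus that term. The shape of the $μ^\top x$ component is what can deviate from Gaussian when $μ$ is concentrated on a few coordinates of a non-Gaussian $x$.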