Characterization of Gaussian Universality Breakdown in High-Dimensional Empirical Risk Minimization

📅 2026-04-03
🤖 AI Summary
This work investigates the breakdown of Gaussian universality for high-dimensional convex empirical risk minimization (ERM) under non-Gaussian designs. By heuristically extending the convex Gaussian min-max theorem to non-Gaussian settings and leveraging concentration inequalities, random matrix theory, and asymptotic analysis, the authors derive an asymptotic min-max characterization that approximates the mean and covariance of the ERM estimator and describes the distribution of its predictions on an independent test covariate. A central contribution is clarifying the scope and limits of Gaussian universality, and proving that, asymptotically, any twice continuously differentiable regularizer is equivalent to a quadratic form determined by its Hessian at zero and its gradient at the estimator's mean. Numerical simulations across diverse loss functions and data models support the theoretical predictions.
📝 Abstract
We study high-dimensional convex empirical risk minimization (ERM) under general non-Gaussian data designs. By heuristically extending the Convex Gaussian Min-Max Theorem (CGMT) to non-Gaussian settings, we derive an asymptotic min-max characterization of key statistics, enabling approximation of the mean $\mu_{\hat{\theta}}$ and covariance $C_{\hat{\theta}}$ of the ERM estimator $\hat{\theta}$. Specifically, under a concentration assumption on the data matrix and standard regularity conditions on the loss and regularizer, we show that for a test covariate $x$ independent of the training data, the projection $\hat{\theta}^\top x$ approximately follows the convolution of the (generally non-Gaussian) distribution of $\mu_{\hat{\theta}}^\top x$ with an independent centered Gaussian variable of variance $\operatorname{Tr}(C_{\hat{\theta}}\mathbb{E}[xx^\top])$. This result clarifies the scope and limits of Gaussian universality for ERMs. Additionally, we prove that any $\mathcal{C}^2$ regularizer is asymptotically equivalent to a quadratic form determined solely by its Hessian at zero and gradient at $\mu_{\hat{\theta}}$. Numerical simulations across diverse losses and models are provided to validate our theoretical predictions and qualitative insights.
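The predictive law described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' code: the helper names `noise_variance` and `approx_projection_samples`, the Rademacher test covariates, and the toy values chosen for the mean `mu` and covariance `C` are all assumptions made for illustration. The sketch only draws samples of $\mu_{\hat{\theta}}^\top x$ plus an independent centered Gaussian of variance $\operatorname{Tr}(C_{\hat{\theta}}\mathbb{E}[xx^\top])$; it does not compute $\mu_{\hat{\theta}}$ or $C_{\hat{\theta}}$ from the min-max characterization.

```python
import numpy as np


def noise_variance(C, second_moment):
    """Variance of the Gaussian component: Tr(C E[x x^T])."""
    return float(np.trace(C @ second_moment))


def approx_projection_samples(mu, C, x_samples, rng):
    """Sample mu^T x + sigma * g, with g standard normal and independent of x.

    E[x x^T] is replaced by its empirical estimate from x_samples.
    """
    n_test = x_samples.shape[0]
    second_moment = x_samples.T @ x_samples / n_test
    sigma = np.sqrt(noise_variance(C, second_moment))
    mean_part = x_samples @ mu                      # generally non-Gaussian
    gauss_part = sigma * rng.standard_normal(n_test)
    return mean_part + gauss_part


rng = np.random.default_rng(0)
d, n_test = 50, 20000

# Hypothetical non-Gaussian test covariates: i.i.d. Rademacher entries.
x_samples = rng.choice([-1.0, 1.0], size=(n_test, d))

# Hypothetical estimator statistics (toy values, not derived from any ERM).
mu = np.zeros(d)
mu[0] = 1.0                    # mu^T x is then a +/-1 variable, far from Gaussian
C = (0.25 / d) * np.eye(d)     # gives Tr(C E[x x^T]) = 0.25

samples = approx_projection_samples(mu, C, x_samples, rng)
print("empirical variance of the projection:", samples.var())
```

With these toy values the sampled projection is a two-component mixture centered at -1 and +1 rather than a single Gaussian, which is the kind of departure from Gaussian universality the abstract describes when $\mu_{\hat{\theta}}^\top x$ is itself non-Gaussian.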
Problem

Research questions and friction points this paper is trying to address.

Gaussian universality
empirical risk minimization
high-dimensional statistics
non-Gaussian data
asymptotic characterization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Gaussian universality breakdown
Convex Gaussian Min-Max Theorem
high-dimensional ERM
non-Gaussian data design
asymptotic equivalence of regularizers
Chiheb Yaakoubi
School of Data Science, Chinese University of Hong Kong (Shenzhen), Shenzhen
Cosme Louart
Assistant Professor, Chinese University of Hong Kong, Shenzhen
Random matrices, Concentration of the measure, Machine learning
Malik Tiomoko
School of Electronic Information and Communications, Huazhong University of Science and Technology (HUST), Wuhan
Zhenyu Liao
Huazhong University of Science and Technology
Machine Learning, Random Matrix Theory, High-dimensional Statistics