Characterization of Gaussian Universality Breakdown in High-Dimensional Empirical Risk Minimization
Abstract
We study high-dimensional convex empirical risk minimization (ERM) under general non-Gaussian data designs. By heuristically extending the Convex Gaussian Min-Max Theorem (CGMT) to non-Gaussian settings, we derive an asymptotic min-max characterization of key statistics, enabling approximation of the mean and covariance of the ERM estimator. Specifically, under a concentration assumption on the data matrix and standard regularity conditions on the loss and regularizer, we show that for a test covariate independent of the training data, the projection of the estimator onto that covariate approximately follows the convolution of a (generally non-Gaussian) distribution with an independent centered Gaussian variable. This result clarifies the scope and limits of Gaussian universality for ERM. Additionally, we prove that any regularizer is asymptotically equivalent to a quadratic form determined solely by its Hessian at zero and its gradient at one point. Numerical simulations across diverse losses and models validate our theoretical predictions and qualitative insights.
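As a rough illustration of the mean-plus-fluctuation picture described in the abstract, the following sketch estimates the mean and covariance of an ERM estimator by Monte Carlo and checks a second-moment identity for its projection onto independent test covariates. Everything specific here is an assumption made for the example and is not taken from the paper: ridge regression as the ERM, i.i.d. Rademacher entries as the non-Gaussian design, the linear ground-truth model, and the names `ridge_fit`, `estimator_moments`, and `projection_variance_check`. The identity being checked (second moment of the projection equals the squared norm of the mean plus the trace of the covariance, for test covariates with identity second moment) is elementary and does not by itself verify the paper's distributional result.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Ridge ERM: argmin_w (1/n) * ||y - X w||^2 + lam * ||w||^2."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)


def estimator_moments(n=200, d=50, lam=0.1, n_trials=400, seed=0):
    """Monte Carlo mean and covariance of the ridge estimator over training sets."""
    rng = np.random.default_rng(seed)
    w_star = rng.standard_normal(d) / np.sqrt(d)  # hypothetical ground truth
    estimates = np.empty((n_trials, d))
    for t in range(n_trials):
        # Non-Gaussian design (assumption): i.i.d. Rademacher entries,
        # mean 0 and variance 1, so E[x x^T] is the identity.
        X = rng.choice([-1.0, 1.0], size=(n, d))
        y = X @ w_star + 0.5 * rng.standard_normal(n)
        estimates[t] = ridge_fit(X, y, lam)
    mean_w = estimates.mean(axis=0)
    cov_w = np.cov(estimates, rowvar=False)
    return estimates, mean_w, cov_w


def projection_variance_check(n_test=5000, seed=1, **kwargs):
    """Compare E[(w_hat^T x)^2] with ||mean_w||^2 + Tr(cov_w) for fresh x."""
    estimates, mean_w, cov_w = estimator_moments(**kwargs)
    d = mean_w.shape[0]
    rng = np.random.default_rng(seed)
    # Test covariates drawn independently of every training set.
    X_test = rng.choice([-1.0, 1.0], size=(n_test, d))
    proj = estimates @ X_test.T  # shape (n_trials, n_test)
    empirical = np.mean(proj ** 2)
    # Contribution of the mean plus contribution of the fluctuations.
    predicted = mean_w @ mean_w + np.trace(cov_w)
    return empirical, predicted


if __name__ == "__main__":
    empirical, predicted = projection_variance_check()
    print(f"empirical second moment: {empirical:.4f}")
    print(f"mean + covariance terms: {predicted:.4f}")
```

The two printed numbers should agree up to Monte Carlo error. Swapping in a different loss, regularizer, or design distribution only requires changing `ridge_fit` and the sampling lines.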
Source: arXiv:2604.03146v1 - http://arxiv.org/abs/2604.03146v1 PDF: https://arxiv.org/pdf/2604.03146v1