Research Paper Researchia:202604.06031

Escape dynamics and implicit bias of one-pass SGD in overparameterized quadratic networks

Dario Bocchi

Submitted: April 6, 2026
Subjects: Statistics; Data Science

Description / Details

We analyze the one-pass stochastic gradient descent dynamics of a two-layer neural network with quadratic activations in a teacher--student framework. In the high-dimensional regime, where the input dimension $N$ and the number of samples $M$ diverge at fixed ratio $\alpha = M/N$, and for finite hidden widths $(p, p^*)$ of the student and teacher, respectively, we study the low-dimensional ordinary differential equations that govern the evolution of the student--teacher and student--student overlap matrices. We show that overparameterization ($p > p^*$) only modestly accelerates escape from a plateau of poor generalization by modifying the prefactor of the exponential decay of the loss. We then examine how unconstrained weight norms introduce a continuous rotational symmetry that results in a nontrivial manifold of zero-loss solutions for $p > 1$. From this manifold the dynamics consistently selects the solution closest to the random initialization, as enforced by a conserved quantity in the ODEs governing the evolution of the overlaps. Finally, a Hessian analysis of the population-loss landscape confirms that the plateau corresponds to saddles with at least one negative eigenvalue, while the solution manifold corresponds to marginal minima.
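
To make the setup concrete, the following is a minimal finite-$N$ sketch of one-pass SGD for a quadratic-activation student and teacher, tracking the student--student and student--teacher overlap matrices. It is not the authors' code. The $1/p$ and $1/p^*$ output normalizations, the $1/\sqrt{N}$ scaling of the pre-activations, the Gaussian initialization, and the names `run_one_pass_sgd` and `generalization_error` are all assumptions chosen for illustration, and the paper's conventions may differ.

```python
import numpy as np


def generalization_error(Q, M, P):
    """Population loss 0.5 * E[(y_hat - y)^2] for standard Gaussian inputs.

    Depends on the weights only through the overlaps Q (student-student),
    M (student-teacher) and P (teacher-teacher). Uses the Gaussian identity
    E[a^2 b^2] = Var(a) Var(b) + 2 Cov(a, b)^2 for zero-mean a, b.
    Assumed (hypothetical) normalization: outputs are averages over hidden units.
    """
    p, p_star = M.shape
    student = (np.trace(Q) ** 2 + 2.0 * np.sum(Q * Q)) / p**2
    cross = (np.trace(Q) * np.trace(P) + 2.0 * np.sum(M * M)) / (p * p_star)
    teacher = (np.trace(P) ** 2 + 2.0 * np.sum(P * P)) / p_star**2
    return 0.5 * (student - 2.0 * cross + teacher)


def run_one_pass_sgd(N=100, p=2, p_star=1, eta=0.05, steps=2000, seed=0):
    """Simulate one-pass SGD; every step uses a fresh sample (no reuse).

    Returns the final overlaps Q (p x p), M (p x p_star) and the
    generalization error after each step.
    """
    rng = np.random.default_rng(seed)
    W_star = rng.standard_normal((p_star, N))  # teacher weights (assumed init)
    W = rng.standard_normal((p, N))            # student weights, unconstrained norm
    P = W_star @ W_star.T / N                  # teacher-teacher overlap (fixed)

    errors = np.empty(steps)
    for t in range(steps):
        x = rng.standard_normal(N)             # fresh Gaussian input
        lam = W @ x / np.sqrt(N)               # student pre-activations
        lam_star = W_star @ x / np.sqrt(N)     # teacher pre-activations
        delta = np.mean(lam**2) - np.mean(lam_star**2)  # y_hat - y

        # Gradient of 0.5 * delta^2 with respect to W
        W -= eta * delta * (2.0 / p) * np.outer(lam, x) / np.sqrt(N)

        Q = W @ W.T / N                        # student-student overlap
        M = W @ W_star.T / N                   # student-teacher overlap
        errors[t] = generalization_error(Q, M, P)
    return Q, M, errors
```

Calling `run_one_pass_sgd(N=200, p=3, p_star=1)` and plotting `errors` against `step / N` gives a finite-size trajectory that can be compared qualitatively with the plateau-and-escape picture described above; the ODE description itself only becomes exact as $N \to \infty$.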


Source: arXiv:2604.03068v1 - http://arxiv.org/abs/2604.03068v1
PDF: https://arxiv.org/pdf/2604.03068v1
