Characterizing Dynamical Stability of Stochastic Gradient Descent in Overparameterized Learning

📅 2024-07-29
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
In overparameterized learning, it remains unclear which global minima are dynamically stable, and thus practically attainable, by stochastic gradient descent (SGD) versus deterministic gradient descent.

Method: The authors introduce the Lyapunov exponent, derived from local dynamical systems theory, as a rigorous stability criterion for minima, modeling SGD as a random dynamical system and analyzing its non-convex optimization dynamics through Lyapunov stability theory.

Contribution/Results: They establish the first theoretically grounded dynamic stability criterion for SGD: the sign of the Lyapunov exponent determines whether SGD can accumulate at a given minimum. Crucially, they argue that implicit regularization in overparameterized settings arises from this dynamic stability selection mechanism, i.e., SGD preferentially converges to minima with negative Lyapunov exponents. This framework provides a novel dynamical-systems perspective on generalization bias in overparameterized models, unifying implicit regularization with stability of stochastic optimization trajectories.

📝 Abstract
For overparameterized optimization tasks, such as the ones found in modern machine learning, global minima are generally not unique. In order to understand generalization in these settings, it is vital to study to which minimum an optimization algorithm converges. The possibility of having minima that are unstable under the dynamics imposed by the optimization algorithm limits the potential minima that the algorithm can find. In this paper, we characterize the global minima that are dynamically stable/unstable for both deterministic and stochastic gradient descent (SGD). In particular, we introduce a characteristic Lyapunov exponent which depends on the local dynamics around a global minimum and rigorously prove that the sign of this Lyapunov exponent determines whether SGD can accumulate at the respective global minimum.
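The sign criterion can be illustrated in a one-dimensional toy model (a hypothetical sketch with assumed per-sample curvatures, not the paper's construction): linearizing SGD around a global minimum gives the multiplicative dynamics delta ← (1 − η·h_i)·delta for a randomly drawn per-sample curvature h_i, so the Lyapunov exponent is E[log|1 − η·h_i|].

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D toy model (not the paper's construction): per-sample
# curvatures h_i of the loss at a shared global minimum.
h = rng.uniform(0.5, 3.0, size=1000)

def lyapunov_exponent(eta, h):
    """Lyapunov exponent of SGD linearized at the minimum.

    One SGD step maps the deviation delta to (1 - eta * h_i) * delta for a
    randomly drawn sample i, so the exponent is E[log|1 - eta * h_i|].
    """
    return float(np.mean(np.log(np.abs(1.0 - eta * h))))

# Negative exponent: deviations contract on average, so SGD can accumulate
# at the minimum; positive exponent: SGD is pushed away from it.
for eta in (0.1, 2.0):
    lam = lyapunov_exponent(eta, h)
    print(f"eta={eta}: lambda={lam:+.3f} -> {'stable' if lam < 0 else 'unstable'}")
```

At the small step size the exponent is negative (stable), while at the large step size the average expansion from high-curvature samples dominates and the exponent turns positive (unstable), matching the sign criterion described in the abstract.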
Problem

Research questions and friction points this paper is trying to address.

Characterize the stability of global minima in overparameterized learning
Determine which minima SGD can dynamically converge to
Introduce a Lyapunov exponent that predicts SGD stability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Characterizes dynamically stable minima under SGD dynamics
Introduces a Lyapunov exponent for stability analysis
Proves the exponent's sign determines whether SGD accumulates at a minimum
Dennis Chemnitz
Fachbereich Mathematik und Informatik, Freie Universität Berlin, 14195 Berlin, Germany
Maximilian Engel
University of Amsterdam, FU Berlin
Stochastic and Multiscale Dynamical Systems