Functional Central Limit Theorem for Stochastic Gradient Descent

📅 2026-02-17
📈 Citations: 0
Influential citations: 0
🤖 AI Summary
This work addresses the lack of characterization of the long-term temporal dynamics of stochastic gradient descent (SGD) trajectories in nonsmooth convex optimization. It establishes, for the first time, a functional central limit theorem for SGD trajectories, moving beyond the conventional focus on only the final or averaged iterates. By combining weak convergence theory for stochastic processes, diffusion process limits, and regularity conditions from convex optimization, the authors construct a continuous-time Gaussian process approximation of the trajectory. The results show that, under appropriate scaling, the SGD trajectory converges weakly to a Gaussian process, thereby revealing the temporal structure of its stochastic fluctuations around the optimum. The approach is demonstrated on nonsmooth robust estimation problems such as the geometric median.
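The geometric median example from the summary is easy to simulate. The sketch below runs plain subgradient SGD on the nonsmooth objective E||x - Z|| and returns the full iterate trajectory, the object whose rescaled fluctuations the paper studies. The step-size schedule, sample size, and data distribution are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

def sgd_geometric_median(samples, step0=1.0, alpha=0.75):
    """Subgradient SGD for the geometric median: minimize E||x - Z||.

    The objective is convex but nonsmooth at the data points, which is the
    setting highlighted in the paper. Step sizes gamma_k = step0 * k**(-alpha)
    are a common Robbins-Monro choice (an assumption, not taken from the
    paper). Returns the full iterate trajectory so its fluctuations around
    the minimizer can be inspected, not just the last iterate.
    """
    stream = iter(samples)
    x = np.array(next(stream), dtype=float)  # initialize at the first sample
    path = [x.copy()]
    for k, z in enumerate(stream, start=1):
        diff = x - z
        norm = np.linalg.norm(diff)
        if norm > 1e-12:  # a subgradient of ||x - z|| at x is diff / norm
            x = x - (step0 * k ** (-alpha)) * diff / norm
        path.append(x.copy())
    return np.array(path)

rng = np.random.default_rng(0)
true_center = np.array([2.0, -1.0])
# Heavy-tailed data: the geometric median of this symmetric law is true_center.
data = true_center + rng.standard_t(df=3, size=(20000, 2))
path = sgd_geometric_median(data)
print(np.round(path[-1], 2))  # last iterate, close to true_center
```

Plotting `path - true_center` against the iteration index would show the trajectory-level fluctuations that the functional CLT characterizes in the limit.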

📝 Abstract
We study the asymptotic shape of the trajectory of the stochastic gradient descent algorithm applied to a convex objective function. Under mild regularity assumptions, we prove a functional central limit theorem for the properly rescaled trajectory. Our result characterizes the long-term fluctuations of the algorithm around the minimizer by providing a diffusion limit for the trajectory. In contrast with classical central limit theorems for the last iterate or Polyak-Ruppert averages, this functional result captures the temporal structure of the fluctuations and applies to non-smooth settings such as robust location estimation, including the geometric median.
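The "properly rescaled trajectory" in the abstract can be written schematically as follows; the exact normalizing sequence and the covariance of the limit depend on the paper's assumptions, so this is a generic template for a functional CLT rather than the paper's theorem:

```latex
% Rescaled trajectory built from the SGD iterates \theta_k and minimizer \theta^\star
X_n(t) \;=\; a_n \bigl( \theta_{\lfloor nt \rfloor} - \theta^\star \bigr),
\qquad t \in [0, 1],
% Functional CLT: weak convergence in a path space such as the Skorokhod space D[0,1]
X_n \;\Longrightarrow\; G \quad \text{in } D[0,1],
```

where \(a_n\) is the normalizing sequence and \(G\) is a centered Gaussian process of diffusion type. A classical (non-functional) CLT would only assert convergence of \(X_n(1)\), i.e., of the last iterate; the functional statement also pins down joint fluctuations across times \(t\).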
Problem

Research questions and friction points this paper is trying to address.

Stochastic Gradient Descent
Functional Central Limit Theorem
Trajectory Fluctuations
Convex Optimization
Diffusion Limit
Innovation

Methods, ideas, or system contributions that make the work stand out.

Functional Central Limit Theorem
Stochastic Gradient Descent
Diffusion Limit
Non-smooth Optimization
Trajectory Fluctuations