🤖 AI Summary
This work studies the convergence of stochastic gradient descent (SGD) for locally strongly convex objective functions with sub-quadratic tails, such as the Huber loss, which arise in online robust and quantile regression and are first-order differentiable but not twice differentiable. It introduces a novel piecewise Lyapunov function framework, enabling the first geometric convergence analysis for merely once-differentiable objectives. Under constant stepsizes, the paper establishes weak convergence, a central limit theorem, and finite-sample bias bounds; under diminishing stepsizes, it derives unified finite-time convergence rates. Key theoretical advances include (i) consistency of quantile regression without the long-standing assumption that the conditional density is continuous, and (ii) robustness to heavy-tailed noise and sub-exponential covariates. These results strengthen the theoretical foundations of robust statistical learning and broaden its practical applicability.
📝 Abstract
Motivated by robust and quantile regression problems, we investigate the stochastic gradient descent (SGD) algorithm for minimizing an objective function $f$ that is locally strongly convex with a sub-quadratic tail. This setting covers many widely used online statistical methods. We introduce a novel piecewise Lyapunov function that enables us to handle functions $f$ with only first-order differentiability, a class that includes popular loss functions such as the Huber loss. Leveraging the proposed Lyapunov function, we derive finite-time moment bounds under general diminishing stepsizes as well as constant stepsizes. Under constant stepsizes, we further establish weak convergence, a central limit theorem, and a bias characterization, providing the first geometric convergence result for SGD with sub-quadratic objectives. Our results have wide applications, especially in online statistical methods; we discuss two in particular. 1) Online robust regression: we consider a corrupted linear model with sub-exponential covariates and heavy-tailed noise, and our analysis yields convergence rates comparable to those for corrupted models with Gaussian covariates and noise. 2) Online quantile regression: our results relax the common assumption in prior work that the conditional density is continuous, and they provide a more fine-grained analysis of the moment bounds.
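To make the setting concrete, here is a minimal sketch (not the paper's code; all parameter values are illustrative assumptions) of the two applications above: constant-stepsize SGD on a linear model with the Huber loss (online robust regression) and with the pinball loss (online quantile regression). Both losses are once- but not twice-differentiable, which is exactly the regime the analysis targets.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 3, 20000
theta_true = np.array([1.0, -2.0, 0.5])
alpha = 0.01  # constant stepsize (illustrative choice)

# Heavy-tailed noise (Student-t, 2.5 degrees of freedom) to mimic the
# robust setting; covariates are standard Gaussian for simplicity.
X = rng.normal(size=(n, d))
y = X @ theta_true + rng.standard_t(df=2.5, size=n)

# --- Online robust regression: SGD with the Huber loss ---
# The Huber score clip(r, -delta, delta) is the derivative of the Huber
# loss in the residual r, so the SGD step is theta += alpha * score * x.
delta = 1.0
theta = np.zeros(d)
for x_t, y_t in zip(X, y):
    r = y_t - x_t @ theta
    theta += alpha * np.clip(r, -delta, delta) * x_t

# --- Online quantile regression: SGD with the pinball (check) loss ---
# For the tau-quantile, the (sub)gradient step uses tau - 1{r < 0}.
tau = 0.5
q = np.zeros(d)
for x_t, y_t in zip(X, y):
    r = y_t - x_t @ q
    q += alpha * (tau - (r < 0)) * x_t
```

With a constant stepsize, both iterates do not converge to a point but fluctuate in a neighborhood of the optimum whose size scales with `alpha`, which is what the weak-convergence and bias results in the abstract characterize.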