Convergence of Decentralized Stochastic Subgradient-based Methods for Nonsmooth Nonconvex functions

๐Ÿ“… 2024-03-18
๐Ÿ“ˆ Citations: 1
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
This work studies the convergence of decentralized stochastic subgradient methods (DSGD-type algorithms) for nonsmooth, nonconvex objective functions that violate Clarke regularity, such as neural networks with non-differentiable activations (e.g., ReLU). The authors propose a unified analytical framework that, for the first time without assuming Clarke regularity, establishes asymptotic convergence guarantees for mainstream variants including DSGD, DSGD-T, and DSGD-M. The analysis couples the discrete iterations with a continuous-time differential inclusion and employs a coercive Lyapunov function to characterize the stable set; under mild regularity conditions and diminishing step sizes, the iterates converge almost surely to this stable set. The framework accommodates both gradient-tracking and momentum mechanisms. Numerical experiments on nonsmooth distributed neural network training confirm both the theoretical reliability and the practical efficiency of the proposed approach.

๐Ÿ“ Abstract
In this paper, we focus on decentralized stochastic subgradient-based methods for minimizing nonsmooth nonconvex functions without Clarke regularity, especially in the decentralized training of nonsmooth neural networks. We propose a general framework that unifies various decentralized subgradient-based methods, such as decentralized stochastic subgradient descent (DSGD), DSGD with the gradient-tracking technique (DSGD-T), and DSGD with momentum (DSGD-M). To establish the convergence properties of our proposed framework, we relate the discrete iterates to the trajectories of a continuous-time differential inclusion, which is assumed to have a coercive Lyapunov function with a stable set $\mathcal{A}$. We prove the asymptotic convergence of the iterates to the stable set $\mathcal{A}$ with sufficiently small and diminishing step-sizes. These results provide the first convergence guarantees for some well-recognized decentralized stochastic subgradient-based methods without Clarke regularity of the objective function. Preliminary numerical experiments demonstrate that our proposed framework yields highly efficient decentralized stochastic subgradient-based methods with convergence guarantees in the training of nonsmooth neural networks.
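As a hedged illustration of the kind of update the abstract describes, the sketch below shows one generic DSGD iteration: each agent averages its neighbors' iterates through a doubly stochastic mixing matrix, then takes a stochastic subgradient step with a diminishing step size. All names, the toy objective, and the complete-graph mixing matrix are this summary's own assumptions, not the paper's notation or code.

```python
import numpy as np

def dsgd_step(X, W, subgrad, alpha):
    """One DSGD step: mix neighbors' iterates, then take a subgradient step.

    X: (n_agents, dim) stacked local iterates
    W: (n_agents, n_agents) doubly stochastic mixing matrix
    """
    G = np.stack([subgrad(x) for x in X])  # one stochastic subgradient per agent
    return W @ X - alpha * G

# Toy run: agents jointly minimize the nonsmooth f(x) = |x|, whose Clarke
# subgradient at x != 0 is sign(x); diminishing step sizes alpha_k = 1/k.
rng = np.random.default_rng(0)
n_agents, dim = 4, 1
X = 5.0 * rng.normal(size=(n_agents, dim))
W = np.full((n_agents, n_agents), 1.0 / n_agents)  # complete-graph averaging
for k in range(1, 200):
    X = dsgd_step(X, W, np.sign, alpha=1.0 / k)
consensus_gap = float(np.max(np.abs(X - X.mean(axis=0))))
```

With diminishing steps the agents reach consensus and oscillate ever closer to the minimizer $0$; the paper's actual analysis works at a far more general level (differential inclusions and a coercive Lyapunov function) rather than on such a toy problem.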
Problem

Research questions and friction points this paper is trying to address.

Decentralized optimization for nonsmooth nonconvex functions
Convergence analysis without Clarke regularity
Training nonsmooth neural networks efficiently
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decentralized stochastic subgradient-based methods framework
Convergence via continuous-time differential inclusion analysis
Training nonsmooth neural networks with convergence guarantees
๐Ÿ”Ž Similar Papers
No similar papers found.
Siyuan Zhang
State Key Laboratory of Scientific and Engineering Computing, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, and University of Chinese Academy of Sciences, China
Nachuan Xiao
The Chinese University of Hong Kong, Shenzhen
Xin Liu
State Key Laboratory of Scientific and Engineering Computing, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, and University of Chinese Academy of Sciences, China