🤖 AI Summary
This work studies the convergence of decentralized stochastic subgradient methods (DSGD-type algorithms) for nonsmooth, nonconvex objective functions that violate Clarke regularity, such as neural networks with non-differentiable activations (e.g., ReLU). We propose a unified analytical framework that, for the first time without assuming Clarke regularity, establishes asymptotic convergence guarantees for mainstream variants including DSGD, DSGD-T, and DSGD-M. Our analysis couples the discrete iterates to the trajectories of a continuous-time differential inclusion and employs a coercive Lyapunov function to characterize the stable set; under mild regularity conditions and diminishing step sizes, we prove that the iterates converge almost surely to this stable set. The framework accommodates both gradient-tracking and momentum mechanisms. Numerical experiments on nonsmooth distributed neural network training confirm both the theoretical guarantees and the practical efficiency of the proposed approach.
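To make the algorithmic setting concrete, below is a minimal, self-contained sketch of plain DSGD on a toy nonsmooth problem. The ring mixing matrix, step-size schedule, and absolute-value local losses are illustrative assumptions for this sketch, not the paper's experimental setup.

```python
import numpy as np

def ring_mixing_matrix(n):
    """Doubly stochastic mixing matrix for a ring of n agents (illustrative choice)."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = 0.5
        W[i, (i - 1) % n] = 0.25
        W[i, (i + 1) % n] = 0.25
    return W

def local_subgradient(x, a):
    """One element of the subdifferential of the nonsmooth local loss f_i(x) = |x - a_i|."""
    return np.sign(x - a)

def dsgd(targets, n_iters=3000, noise=0.1, seed=0):
    """Plain DSGD: consensus averaging followed by a local stochastic subgradient step."""
    rng = np.random.default_rng(seed)
    n = len(targets)
    W = ring_mixing_matrix(n)
    x = rng.normal(size=n)                  # each agent holds its own copy of the variable
    for k in range(n_iters):
        gamma = 1.0 / np.sqrt(k + 1)        # diminishing step sizes, as the theory requires
        g = local_subgradient(x, targets)   # exact local subgradients ...
        g = g + noise * rng.normal(size=n)  # ... corrupted by stochastic oracle noise
        x = W @ x - gamma * g               # mixing (communication) step + subgradient step
    return x

# Agents jointly minimize sum_i |x - a_i|; iterates approach consensus near a median of the a_i.
print(dsgd(np.array([-1.0, 0.0, 1.0, 2.0])))
```

DSGD-T would additionally maintain a tracked estimate of the average subgradient, and DSGD-M would filter `g` through a momentum buffer before the update; both reuse the same mixing step shown here.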
📄 Abstract
In this paper, we focus on decentralized stochastic subgradient-based methods for minimizing nonsmooth nonconvex functions without Clarke regularity, with particular emphasis on the decentralized training of nonsmooth neural networks. We propose a general framework that unifies various decentralized subgradient-based methods, such as decentralized stochastic subgradient descent (DSGD), DSGD with the gradient-tracking technique (DSGD-T), and DSGD with momentum (DSGD-M). To establish the convergence properties of our proposed framework, we relate the discrete iterates to the trajectories of a continuous-time differential inclusion, which is assumed to admit a coercive Lyapunov function with a stable set $\mathcal{A}$. We prove asymptotic convergence of the iterates to the stable set $\mathcal{A}$ under sufficiently small and diminishing step sizes. These results provide the first convergence guarantees for several well-recognized decentralized stochastic subgradient-based methods without Clarke regularity of the objective function. Preliminary numerical experiments demonstrate that our proposed framework yields highly efficient decentralized stochastic subgradient-based methods with convergence guarantees for the training of nonsmooth neural networks.
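As a reference for the updates the abstract names, the textbook forms of the three methods are sketched below for agent $i$ with mixing weights $w_{ij}$, step sizes $\gamma_k$, and stochastic subgradient estimates $g_i^k$. The notation is assumed here for illustration; the paper's precise formulation may differ.

```latex
% Textbook forms of the three updates (schematic; notation assumed,
% not necessarily identical to the paper's formulation).
\begin{align*}
\text{DSGD:}   \quad & x_i^{k+1} = \sum_{j} w_{ij}\, x_j^k - \gamma_k\, g_i^k,\\[2pt]
\text{DSGD-T:} \quad & x_i^{k+1} = \sum_{j} w_{ij}\, x_j^k - \gamma_k\, y_i^k,
  \qquad y_i^{k+1} = \sum_{j} w_{ij}\, y_j^k + g_i^{k+1} - g_i^k,\\[2pt]
\text{DSGD-M:} \quad & m_i^{k+1} = \beta\, m_i^k + (1-\beta)\, g_i^k,
  \qquad x_i^{k+1} = \sum_{j} w_{ij}\, x_j^k - \gamma_k\, m_i^{k+1}.
\end{align*}
% The continuous-time counterpart coupled to these iterates is a
% differential inclusion of the form
\[
  \dot{x}(t) \in -\,\mathcal{G}\bigl(x(t)\bigr),
\]
% where G is a set-valued map generalizing the gradient (the Clarke
% subdifferential is one choice, but Clarke regularity is not assumed),
% and the coercive Lyapunov function decreases along its trajectories
% toward the stable set A.
```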