🤖 AI Summary
This paper addresses Lipschitz continuous, nonsmooth, nonconvex stochastic optimization over decentralized networks without requiring gradient information. We propose two zeroth-order decentralized algorithms: DGFM and its enhanced variant DGFM+. DGFM+ is the first method to integrate randomized smoothing, gradient tracking, and variance reduction in a decentralized zeroth-order setting. It uses a novel double-batch sampling scheme (a mini-batch at every iteration plus a periodic larger batch) that improves the convergence complexity from the $O(d^{3/2}\delta^{-1}\varepsilon^{-4})$ of DGFM to $O(d^{3/2}\delta^{-1}\varepsilon^{-3})$. Theoretically, both algorithms are proven to converge to a $(\delta,\varepsilon)$-Goldstein stationary point. The framework supports flexible oracle queries, including single-sample, mini-batch, and periodic large-batch evaluations. Empirical results on real-world datasets demonstrate that DGFM+ significantly outperforms existing decentralized zeroth-order methods in terms of both convergence speed and solution quality.
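As background, the stationarity notion named above is usually defined as follows. This is the standard textbook formulation, not a quotation from the paper; the symbols $\partial f$ (Clarke subdifferential), $\mathbb{B}_\delta(x)$ (closed ball of radius $\delta$ around $x$), and $\partial_\delta f$ (Goldstein $\delta$-subdifferential) are notation assumed here for illustration.

$$
\partial_\delta f(x) = \operatorname{conv}\Big(\bigcup_{y \in \mathbb{B}_\delta(x)} \partial f(y)\Big),
\qquad
x \text{ is a } (\delta,\varepsilon)\text{-Goldstein stationary point if }
\min_{g \in \partial_\delta f(x)} \|g\| \le \varepsilon .
$$

Setting $\delta = 0$ recovers ordinary Clarke $\varepsilon$-stationarity, which is why $\delta$ appears alongside $\varepsilon$ in the complexity bounds.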
📝 Abstract
We consider decentralized gradient-free minimization of Lipschitz continuous functions that are neither smooth nor convex. We propose two novel gradient-free algorithms, the Decentralized Gradient-Free Method (DGFM) and its variant, the Decentralized Gradient-Free Method+ (DGFM+). Based on randomized smoothing and gradient tracking, DGFM queries the zeroth-order oracle on only a single sample per iteration, which keeps the computational load on each individual node low. Theoretically, DGFM achieves a complexity of O(d^(3/2)δ^(-1)ε^(-4)) for obtaining a (δ,ε)-Goldstein stationary point. DGFM+, an advanced version of DGFM, incorporates variance reduction to further improve the convergence behavior. It samples a mini-batch at each iteration and periodically draws a larger batch of data, which improves the complexity to O(d^(3/2)δ^(-1)ε^(-3)). Moreover, experimental results on real-world datasets underscore the empirical advantages of the proposed algorithms.
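To make the two building blocks named in the abstract concrete, here is a minimal Python sketch of a two-point randomized-smoothing gradient estimator combined with a generic gradient-tracking loop. It is an illustration under stated assumptions, not the paper's DGFM: the function names, the mixing matrix `W`, the step size `eta`, the deterministic local losses, and the exact update order are all hypothetical choices, and the variance reduction of DGFM+ is not shown.

```python
import numpy as np

def sample_unit_sphere(d, rng):
    """Draw a direction uniformly from the unit sphere in R^d."""
    w = rng.standard_normal(d)
    return w / np.linalg.norm(w)

def zo_gradient(f, x, delta, rng):
    """Two-point zeroth-order estimate of the gradient of the
    delta-smoothed surrogate of f, using function values only."""
    d = x.size
    w = sample_unit_sphere(d, rng)
    return (d / (2.0 * delta)) * (f(x + delta * w) - f(x - delta * w)) * w

def gradient_tracking(fs, W, x0, delta, eta, steps, seed=0):
    """Generic gradient-tracking loop (an assumed textbook form).

    fs : list of local objectives, one per node
    W  : doubly stochastic mixing matrix of the network (assumed)
    Returns local iterates X, trackers Y, and latest estimates G,
    each with one row per node.
    """
    rng = np.random.default_rng(seed)
    m = len(fs)
    X = np.tile(x0, (m, 1))
    G = np.stack([zo_gradient(fs[i], X[i], delta, rng) for i in range(m)])
    Y = G.copy()  # trackers start at the local estimates
    for _ in range(steps):
        X = W @ (X - eta * Y)            # local step, then neighbor averaging
        G_new = np.stack([zo_gradient(fs[i], X[i], delta, rng) for i in range(m)])
        Y = W @ (Y + G_new - G)          # track the network-average estimate
        G = G_new
    return X, Y, G

# Hypothetical toy problem: three nodes, each with a nonsmooth l1 loss.
centers = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
fs = [lambda x, c=c: np.abs(x - c).sum() for c in centers]
W = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])
X, Y, G = gradient_tracking(fs, W, np.zeros(3), delta=0.1, eta=0.01, steps=200)
```

Because `W` is doubly stochastic and the trackers start at the local estimates, the average of the rows of `Y` equals the average of the rows of `G` at every iteration. This is the property that lets each node follow the network-wide gradient estimate using only communication with its neighbors.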