Enhancing Parallelism in Decentralized Stochastic Convex Optimization

📅 2025-06-01
📈 Citations: 0 · Influential: 0
🤖 AI Summary
Decentralized stochastic convex optimization (SCO) suffers from a parallel-scalability bottleneck: convergence degrades significantly once the number of machines exceeds a critical threshold. To address this, we propose Decentralized Anytime SGD (DA-SGD), the first method provably improving the critical parallelism limit within the SCO framework, thereby narrowing the statistical gap between decentralized and centralized learning. DA-SGD integrates an anytime iterate-averaging design, graph signal processing, and a rigorous convergence analysis to achieve centralized-optimal convergence rates on highly connected topologies. Theoretically, it establishes a higher upper bound on achievable parallelism than state-of-the-art methods. Empirically, DA-SGD exhibits no performance degradation under multi-machine scaling, maintaining stable convergence and solution accuracy.
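For intuition about the ingredients named above, here is a minimal, hypothetical NumPy sketch of decentralized SGD with anytime (running weighted average) iterates: each node queries its stochastic gradient at a weighted average of its past iterates and mixes its iterate with ring neighbors through a doubly stochastic gossip matrix. The quadratic objective, ring topology, step size, and linear weighting are illustrative assumptions, not details from the paper.

```python
import numpy as np

def ring_mixing_matrix(m):
    """Doubly stochastic gossip matrix for a ring: each node
    averages itself with its two neighbors."""
    W = np.zeros((m, m))
    for i in range(m):
        W[i, i] = 1 / 3
        W[i, (i - 1) % m] = 1 / 3
        W[i, (i + 1) % m] = 1 / 3
    return W

def da_sgd_sketch(m=8, d=10, T=2000, eta=0.05, noise=0.1, seed=0):
    """Hypothetical sketch of decentralized anytime SGD on the toy
    objective f(x) = 0.5 * ||x - x_star||^2 with Gaussian gradient
    noise. Gradients are queried at a running weighted average of
    the iterates; iterates are gossip-mixed every round."""
    rng = np.random.default_rng(seed)
    x_star = rng.normal(size=d)      # minimizer of the toy quadratic
    W = ring_mixing_matrix(m)

    X = np.zeros((m, d))             # per-node iterates
    Xbar = np.zeros((m, d))          # per-node anytime (weighted) averages
    weight_sum = 0.0

    for t in range(1, T + 1):
        alpha = t                    # linearly increasing averaging weights
        weight_sum += alpha
        # stochastic gradient evaluated at the anytime average Xbar,
        # not at the raw iterates X -- the anytime design choice
        grads = (Xbar - x_star) + noise * rng.normal(size=(m, d))
        X = W @ (X - eta * grads)    # local SGD step, then gossip mixing
        Xbar += (alpha / weight_sum) * (X - Xbar)  # online weighted average

    # suboptimality of the network-averaged query point
    return 0.5 * np.sum((Xbar.mean(axis=0) - x_star) ** 2)

if __name__ == "__main__":
    print(f"final suboptimality ~ {da_sgd_sketch():.4f}")
```

The defining choice is that gradients are evaluated at `Xbar` rather than at `X`, while the gossip step `W @ (...)` is what ties convergence to the connectivity of the topology.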

📝 Abstract
Decentralized learning has emerged as a powerful approach for handling large datasets across multiple machines in a communication-efficient manner. However, such methods often face scalability limitations, as increasing the number of machines beyond a certain point negatively impacts convergence rates. In this work, we propose Decentralized Anytime SGD, a novel decentralized learning algorithm that significantly extends the critical parallelism threshold, enabling the effective use of more machines without compromising performance. Within the stochastic convex optimization (SCO) framework, we establish a theoretical upper bound on parallelism that surpasses the current state-of-the-art, allowing larger networks to achieve favorable statistical guarantees and closing the gap with centralized learning in highly connected topologies.
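As background for the "critical parallelism threshold" mentioned above (a standard centralized SCO fact, not this paper's new bound): minibatch SGD enjoys a linear statistical speedup in the number of machines M only up to a critical value, beyond which the optimization term dominates and extra machines stop helping.

```latex
% Standard centralized baseline (background, not the paper's bound):
% for a convex, L-smooth objective F with stochastic-gradient variance
% \sigma^2 and domain radius B, minibatch SGD over M machines and T
% rounds attains excess risk
\[
  \mathbb{E}\, F(\bar{x}_T) - \min_x F(x)
  = O\!\left( \frac{L B^2}{T} + \frac{\sigma B}{\sqrt{M T}} \right),
\]
% so the statistical term \sigma B / \sqrt{MT} dominates, and adding
% machines keeps helping, only while
\[
  M \lesssim \frac{\sigma^2 T}{L^2 B^2}.
\]
% Decentralized methods typically tolerate a smaller threshold that
% also depends on the graph topology; this paper's contribution is
% to raise that threshold.
```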
Problem

Research questions and friction points this paper is trying to address.

Enhancing parallelism in decentralized stochastic convex optimization
Overcoming scalability limits in decentralized learning with more machines
Closing performance gap between decentralized and centralized learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decentralized Anytime SGD algorithm
Extends critical parallelism threshold (see the illustrative sweep below)
Theoretical upper bound on parallelism
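As a purely illustrative usage of the sketch above: sweeping the number of machines at a fixed round budget shows how one would probe a parallelism threshold empirically, by checking whether suboptimality keeps shrinking as M grows.

```python
# Illustrative sweep with da_sgd_sketch from above: fixed round budget T,
# growing machine count m. Machine counts and budget are arbitrary choices.
for m in (2, 8, 32):
    print(f"m={m:3d}  suboptimality ~ {da_sgd_sketch(m=m, T=2000):.4f}")
```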
Ofri Eisen
Department of Electrical and Computer Engineering, Technion, Haifa, Israel
Ron Dorfman
PhD Student, Technion - Israel Institute of Technology
Machine Learning · Stochastic Optimization
K. Levy
Department of Electrical and Computer Engineering, Technion, Haifa, Israel