🤖 AI Summary
To address slow convergence, high communication overhead, and straggler interference in asynchronous federated learning under non-convex objectives and data heterogeneity, this paper proposes a staleness-aware asynchronous framework. The method introduces a freshness-weighted aggregation mechanism and an adaptive dynamic learning-rate scheduler. It establishes, for the first time under non-convex settings, a tight convergence bound that jointly accounts for update staleness, gradient variance, and data heterogeneity. By integrating client sampling with asynchronous updates, the framework is implemented in PyTorch augmented with asyncio. Extensive experiments demonstrate substantial improvements in convergence speed and final model accuracy across diverse heterogeneous non-convex tasks, e.g., vision and language modeling, while preserving theoretical rigor and system scalability.
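The freshness-weighted aggregation described above can be sketched as follows. This is an illustrative reading, not the paper's exact rule: the mixing weight, the polynomial decay `(1 + tau)^(-alpha)`, and the parameter names `base_mixing` and `alpha` are assumptions chosen to show how fresher updates (small staleness `tau`) receive more weight.

```python
def staleness_weighted_update(global_params, client_params, staleness,
                              base_mixing=0.5, alpha=0.5):
    """Blend a (possibly stale) client model into the global model.

    The mixing weight decays polynomially with staleness tau:
        w = base_mixing * (1 + tau) ** (-alpha)
    so a fresh update (tau = 0) contributes base_mixing, while a stale
    one is down-weighted. Hypothetical sketch; the paper's weighting
    may differ.
    """
    w = base_mixing * (1.0 + staleness) ** (-alpha)
    # Convex combination of global and client parameters, per tensor/name.
    return {name: (1.0 - w) * global_params[name] + w * client_params[name]
            for name in global_params}
```

For example, a fresh update (`staleness=0`) is mixed in at the full base weight, while one delayed by three rounds (`staleness=3`, `alpha=0.5`) contributes only half that weight.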
📝 Abstract
Federated Learning (FL) enables collaborative model training across decentralized devices while preserving data privacy. However, traditional FL suffers from communication overhead, system heterogeneity, and straggler effects. Asynchronous Federated Learning (AFL) addresses these issues by allowing clients to update independently, improving scalability and reducing synchronization delays. This paper extends AFL to handle non-convex objective functions and heterogeneous datasets, common in modern deep learning. We present a rigorous convergence analysis, deriving bounds on the expected gradient norm and studying the effects of staleness, variance, and heterogeneity. To mitigate stale updates, we introduce a staleness-aware aggregation that prioritizes fresher updates and a dynamic learning-rate schedule that adapts to client staleness and heterogeneity, improving stability and convergence. Our framework accommodates variations in computational power, data distribution, and communication delays, making it practical for real-world applications. We also analyze the impact of client selection strategies (sampling with or without replacement) on variance and convergence. Implemented in PyTorch with Python's asyncio, our approach is validated through experiments demonstrating improved performance and scalability for asynchronous, heterogeneous, and non-convex FL scenarios.
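The dynamic learning-rate schedule mentioned in the abstract can be sketched in a minimal form. This is a hedged illustration, not the paper's actual schedule: the round-decay factor, the staleness exponent `beta`, and the parameter names are assumptions that capture the stated idea of shrinking the step size for stale clients while decaying over training rounds.

```python
def staleness_adaptive_lr(base_lr, staleness, round_idx,
                          round_decay=0.01, beta=0.5):
    """Return a per-update learning rate that (a) decays over rounds and
    (b) shrinks for stale updates, damping their destabilizing effect.

    Hypothetical sketch: the exact functional form in the paper may differ.
    """
    round_factor = 1.0 + round_decay * round_idx          # standard decay over rounds
    staleness_factor = (1.0 + staleness) ** beta          # penalize stale updates
    return base_lr / (round_factor * staleness_factor)
```

With `beta=0.5`, an update that is three rounds stale takes half the step of a fresh one at the same round, which matches the abstract's goal of trading off progress from stale clients against stability.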