🤖 AI Summary
This work investigates the convergence and generalization performance of decentralized federated learning (DFL) on edge-device networks, focusing on the coupled effects of network topology, data heterogeneity (non-IID data), and training strategies. It first derives convergence rates for different DFL deployment strategies, then analyzes linear, ring, star, and mesh topologies under varying degrees of non-IID data, evaluating them on classical models, deep neural networks, and large language models with real-world datasets. The results show that DFL models converge to the optimal model under IID data, while the convergence rate is inversely proportional to the degree of non-IID data distribution. The study combines theoretical convergence analysis with empirical benchmarks and offers guidelines for deploying DFL in practical applications.
📝 Abstract
The widespread adoption of smartphones and smart wearable devices has driven the use of Centralized Federated Learning (CFL) for training powerful machine learning models while preserving data privacy. However, CFL is limited by its reliance on a central server, which hurts latency and system robustness. Decentralized Federated Learning (DFL) was introduced to address these challenges. It enables direct collaboration among participating devices without a central server: each device can independently connect with other devices and share model parameters. This work explores crucial factors influencing the convergence and generalization capacity of DFL models, with emphasis on network topologies, non-IID data distribution, and training strategies. We first derive the convergence rate of different DFL model deployment strategies. Then, we comprehensively analyze various network topologies (e.g., linear, ring, star, and mesh) with different degrees of non-IID data and evaluate them over widely adopted machine learning models (e.g., classical models, deep neural networks, and large language models) and real-world datasets. The results reveal that models converge to the optimal model under IID data. However, the convergence rate is inversely proportional to the degree of non-IID data distribution. Our findings will serve as valuable guidelines for designing effective DFL model deployments in practical applications.
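To make the role of topology concrete, the following is a minimal sketch of serverless parameter sharing over the four topologies named above. It is not the paper's method. The function names, the Metropolis-Hastings mixing weights, and the toy scalar "models" are all assumptions chosen for illustration, and local training is omitted so that only the effect of the communication graph is visible.

```python
def build_adjacency(topology, n):
    """Return an n x n symmetric 0/1 adjacency matrix without self-loops."""
    adj = [[0] * n for _ in range(n)]

    def link(i, j):
        if i != j:
            adj[i][j] = adj[j][i] = 1

    if topology == "linear":      # chain: 0 - 1 - ... - (n-1)
        for i in range(n - 1):
            link(i, i + 1)
    elif topology == "ring":      # chain plus one closing edge
        for i in range(n):
            link(i, (i + 1) % n)
    elif topology == "star":      # device 0 is the hub (illustrative choice)
        for i in range(1, n):
            link(0, i)
    elif topology == "mesh":      # every device talks to every other device
        for i in range(n):
            for j in range(i + 1, n):
                link(i, j)
    else:
        raise ValueError(f"unknown topology: {topology}")
    return adj


def metropolis_weights(adj):
    """Assumed mixing rule: symmetric, doubly stochastic Metropolis-Hastings weights."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if adj[i][j]:
                w[i][j] = 1.0 / (1 + max(deg[i], deg[j]))
        w[i][i] = 1.0 - sum(w[i])  # remaining mass stays on the device itself
    return w


def gossip_round(params, w):
    """Each device replaces its parameters with a weighted average over its neighbors."""
    n, dim = len(params), len(params[0])
    return [
        [sum(w[i][j] * params[j][k] for j in range(n)) for k in range(dim)]
        for i in range(n)
    ]


def rounds_to_consensus(topology, n, tol=1e-6, max_rounds=10_000):
    """Count averaging rounds until all devices agree to within tol."""
    w = metropolis_weights(build_adjacency(topology, n))
    params = [[float(i)] for i in range(n)]  # toy one-parameter "models"
    for r in range(1, max_rounds + 1):
        params = gossip_round(params, w)
        values = [p[0] for p in params]
        if max(values) - min(values) < tol:
            return r
    return max_rounds


if __name__ == "__main__":
    for name in ("linear", "ring", "star", "mesh"):
        print(name, rounds_to_consensus(name, n=8))
```

Under these assumed weights, the fully connected mesh averages all devices in a single round, while sparser graphs such as the chain need many more rounds to agree. In a real DFL run each device would also take local training steps between rounds, and with non-IID data those local updates pull devices apart again, which is where the interaction between topology and data heterogeneity arises.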