🤖 AI Summary
This work addresses the non-convex optimization dynamics of wide neural networks trained with distributed gradient descent (DGD) in serverless peer-to-peer learning settings, e.g., smart-city edge networks. Method: We propose a dynamical framework that unifies neural tangent kernel (NTK) theory with distributed consensus analysis, enabling explicit characterization of parameter evolution and test-error convergence over arbitrary peer-to-peer topologies. Contribution/Results: Unlike conventional centralized modeling paradigms, our framework accurately predicts parameter trajectories and generalization performance on classification tasks without modifying the network architecture or tuning hyperparameters. Theoretical analysis and empirical evaluation agree closely across diverse edge-network topologies and data distributions, providing an interpretable, predictive theoretical foundation for collaborative learning on resource-constrained edge systems whose architectures and hyperparameters are difficult to change after deployment.
📝 Abstract
Peer-to-peer learning is an increasingly popular framework that enables beyond-5G distributed edge devices to collaboratively train deep neural networks in a privacy-preserving manner without the aid of a central server. Neural network training algorithms for emerging environments, e.g., smart cities, involve many design choices -- such as the network architecture and hyperparameters -- that are difficult to tune once deployed. This creates a critical need to characterize the training dynamics of the distributed optimization algorithms used to train highly non-convex neural networks in peer-to-peer learning environments. In this work, we provide an explicit characterization of the learning dynamics of wide neural networks trained with popular distributed gradient descent (DGD) algorithms. Our results leverage both recent advances in neural tangent kernel (NTK) theory and an extensive body of prior work on distributed learning and consensus. We validate our analytical results by accurately predicting the parameter and error dynamics of wide neural networks trained for classification tasks.
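To make the DGD iteration referenced above concrete, the following is a minimal sketch of serverless distributed gradient descent over a peer-to-peer topology: each agent mixes its parameters with its neighbors via a doubly stochastic matrix, then takes a local gradient step. The ring topology, Metropolis mixing weights, and quadratic local losses here are illustrative assumptions for the sketch, not the paper's actual models or experimental setup.

```python
import numpy as np

def metropolis_weights(adj):
    """Build a doubly stochastic mixing matrix from an adjacency
    matrix using the Metropolis-Hastings rule (illustrative choice)."""
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if adj[i, j]:
                W[i, j] = 1.0 / (1 + max(deg[i], deg[j]))
    # Self-weights absorb the remaining mass so each row sums to 1.
    np.fill_diagonal(W, 1.0 - W.sum(axis=1))
    return W

def dgd(W, grads, x0, lr=0.1, steps=200):
    """One DGD run: x_i <- sum_j W_ij x_j - lr * grad f_i(x_i),
    executed in parallel across all agents (no central server)."""
    x = x0.copy()
    for _ in range(steps):
        local_grads = np.vstack([g(xi) for g, xi in zip(grads, x)])
        x = W @ x - lr * local_grads
    return x

# Ring of 4 agents; agent i holds the local loss f_i(x) = 0.5 * (x - b_i)^2,
# so the global minimizer is mean(b). (Hypothetical toy losses.)
adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]])
W = metropolis_weights(adj)
b = np.array([1.0, 2.0, 3.0, 4.0])
grads = [lambda x, bi=bi: x - bi for bi in b]
x = dgd(W, grads, x0=np.zeros((4, 1)))
# With a constant step size, agents reach approximate consensus in a
# neighborhood of the global minimizer mean(b) = 2.5.
```

With a constant step size, classical consensus analysis shows the agents converge to an O(lr)-neighborhood of consensus around the global minimizer; the paper's framework characterizes the analogous trajectories when the local losses come from wide neural networks in the NTK regime.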