🤖 AI Summary
Conventional wisdom holds that data heterogeneity impedes convergence in federated learning.
Method: We systematically analyze the Expectation-Maximization (EM) algorithm for Federated Mixture Linear Regression (FMLR), focusing on how the ratio $m/n$—where $m$ is the number of clients and $n$ the local sample size—quantifies data heterogeneity and affects convergence rate.
Contribution/Results: We prove that with a signal-to-noise ratio (SNR) of order $\Omega(\sqrt{K})$, a well-initialized federated EM converges to within the minimax distance of the ground truth in every regime of $m/n$. Crucially, we show that data heterogeneity can accelerate convergence: when $m$ grows exponentially in $n$, a constant number of iterations suffices. Synthetic experiments illustrate that heterogeneity need not be a bottleneck and can instead speed up convergence, challenging the long-standing assumption that heterogeneity inevitably slows it.
📝 Abstract
Data heterogeneity has been a long-standing bottleneck in studying the convergence rates of Federated Learning algorithms. In order to better understand the issue of data heterogeneity, we study the convergence rate of the Expectation-Maximization (EM) algorithm for the Federated Mixture of $K$ Linear Regressions model. We fully characterize the convergence rate of the EM algorithm under all regimes of $m/n$ where $m$ is the number of clients and $n$ is the number of data points per client. We show that with a signal-to-noise ratio (SNR) of order $\Omega(\sqrt{K})$, the well-initialized EM algorithm converges within the minimax distance of the ground truth under each of the regimes. Interestingly, we identify that when $m$ grows exponentially in $n$, the EM algorithm only requires a constant number of iterations to converge. We perform experiments on synthetic datasets to illustrate our results. Surprisingly, the results show that rather than being a bottleneck, data heterogeneity can accelerate the convergence of federated learning algorithms.
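To make the setting concrete, the following is a minimal sketch of one way EM could be run for a federated mixture of $K$ linear regressions. It is not the authors' implementation. It assumes a common formulation in which each client holds $n$ samples generated by a single latent component, with Gaussian covariates, a known noise level, and uniform mixing weights. All names (`simulate_fmlr`, `federated_em_step`, `run_demo`) and all parameter values are hypothetical choices for illustration.

```python
import numpy as np


def simulate_fmlr(m, n, theta_true, noise_std, rng):
    """Draw synthetic data: each of m clients gets n samples from ONE component.

    Assumption (not from the source): Gaussian covariates, uniform latent labels.
    """
    K, d = theta_true.shape
    z = rng.integers(0, K, size=m)                 # latent component per client
    X = rng.normal(size=(m, n, d))                 # covariates
    noise = noise_std * rng.normal(size=(m, n))
    y = np.einsum("mnd,md->mn", X, theta_true[z]) + noise
    return X, y, z


def federated_em_step(X, y, theta, noise_std):
    """One EM iteration; returns updated parameters and client responsibilities."""
    m, n, d = X.shape
    K = theta.shape[0]

    # E-step (per client): posterior over components using all n local samples.
    pred = np.einsum("mnd,kd->mkn", X, theta)              # shape (m, K, n)
    sq_res = ((y[:, None, :] - pred) ** 2).sum(axis=2)     # shape (m, K)
    log_w = -sq_res / (2.0 * noise_std ** 2)
    log_w -= log_w.max(axis=1, keepdims=True)              # numerical stability
    w = np.exp(log_w)
    w /= w.sum(axis=1, keepdims=True)

    # Local sufficient statistics that each client would send to the server.
    A_local = np.einsum("mnd,mne->mde", X, X)              # shape (m, d, d)
    b_local = np.einsum("mnd,mn->md", X, y)                # shape (m, d)

    # M-step (server): weighted least squares per component.
    new_theta = np.empty_like(theta)
    for k in range(K):
        A = np.einsum("m,mde->de", w[:, k], A_local)
        b = w[:, k] @ b_local
        new_theta[k] = np.linalg.solve(A + 1e-8 * np.eye(d), b)
    return new_theta, w


def run_demo(seed=0, m=200, n=20, iters=10):
    """Run EM from an initialization near the truth; return the worst-case error."""
    rng = np.random.default_rng(seed)
    theta_true = np.array([[2.0, 0, 0, 0, 0], [-2.0, 0, 0, 0, 0]])
    noise_std = 0.5
    X, y, _ = simulate_fmlr(m, n, theta_true, noise_std, rng)
    theta = theta_true + 0.3 * rng.normal(size=theta_true.shape)  # "well-initialized"
    for _ in range(iters):
        theta, _ = federated_em_step(X, y, theta, noise_std)
    return float(np.max(np.linalg.norm(theta - theta_true, axis=1)))
```

Calling `run_demo()` returns the largest parameter error across components after a fixed number of iterations. Varying `m` and `n` in this toy setup is one informal way to explore how the ratio $m/n$ affects the number of iterations needed, though it is no substitute for the paper's analysis or experiments.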