📝 Abstract
Using pre-trained models has been found to reduce the effect of data heterogeneity and to speed up federated learning algorithms. Recent works have investigated the use of first- and second-order statistics to aggregate local client data distributions at the server and achieve very high performance without any training. In this work, we propose a training-free method based on an unbiased estimator of class covariance matrices. Our method, which uses only first-order statistics in the form of class means communicated by clients to the server, incurs only a fraction of the communication cost required by methods based on communicating second-order statistics. We show how these estimated class covariances can be used to initialize a linear classifier, thus exploiting the covariances without actually sharing them. Compared to state-of-the-art methods that also share only class means, our approach improves performance by 4–26% at exactly the same communication cost. Moreover, our method achieves performance competitive with or superior to sharing second-order statistics, with dramatically less communication overhead. Finally, using our method to initialize classifiers and then performing federated fine-tuning yields better and faster convergence. Code is available at https://github.com/dipamgoswami/FedCOF.
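The central idea, recovering second-order statistics from first-order ones, can be illustrated with a small simulation. The sketch below rests on a standard statistical identity: if client k holds n_k samples of a class, its uploaded class mean has covariance Σ/n_k, so the count-weighted scatter of client means around the global mean is an unbiased estimate of Σ. This is a minimal illustration of the principle, not necessarily the paper's exact FedCOF estimator; the function name, dimensions, and client counts are all hypothetical.

```python
import numpy as np

def estimate_class_cov(means, counts):
    """Unbiased class-covariance estimate from per-client class means and counts.

    Uses the identity E[sum_k n_k (mu_k - mu_bar)(mu_k - mu_bar)^T] = (K - 1) Sigma,
    where mu_k is the class mean uploaded by client k and n_k its sample count.
    """
    counts = np.asarray(counts, dtype=float)
    N = counts.sum()
    mu_bar = (counts[:, None] * means).sum(axis=0) / N   # weighted global class mean
    centered = means - mu_bar
    scatter = (counts[:, None] * centered).T @ centered  # sum_k n_k c_k c_k^T
    return scatter / (len(counts) - 1)

# Simulate clients uploading only first-order statistics: the mean of n_k
# i.i.d. samples from N(mu, Sigma) is itself Gaussian with covariance Sigma / n_k.
rng = np.random.default_rng(0)
d, K = 5, 500                                    # feature dim, number of clients (illustrative)
true_cov = np.diag(np.arange(1.0, d + 1))        # ground-truth class covariance
counts = rng.integers(20, 100, size=K)           # samples of this class per client
client_means = np.stack(
    [rng.multivariate_normal(np.zeros(d), true_cov / n) for n in counts]
)

cov_hat = estimate_class_cov(client_means, counts)
rel_err = np.linalg.norm(cov_hat - true_cov) / np.linalg.norm(true_cov)
```

The server thus obtains a class covariance without any client ever transmitting one. Such an estimate could then seed an LDA-style linear head, e.g. weights of the form (Σ̂ + λI)⁻¹ μ_c per class; that particular initialization is one common choice, given here as an assumption rather than the paper's exact construction.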