📝 Abstract
Using pre-trained models has been found to reduce the effect of data heterogeneity and to speed up federated learning algorithms. Recent works have investigated the use of first- and second-order statistics to aggregate local client data distributions at the server and achieve very high performance without any training. In this work, we propose a training-free method based on an unbiased estimator of class covariance matrices. Our method, which uses only first-order statistics in the form of class means communicated by clients to the server, incurs only a fraction of the communication cost required by methods based on communicating second-order statistics. We show how these estimated class covariances can be used to initialize a linear classifier, thus exploiting the covariances without actually sharing them. Compared to state-of-the-art methods that also share only class means, our approach improves performance by 4–26% at exactly the same communication cost. Moreover, our method achieves performance competitive with or superior to sharing second-order statistics, with dramatically less communication overhead. Finally, using our method to initialize classifiers and then performing federated fine-tuning yields better and faster convergence. Code is available at https://github.com/dipamgoswami/FedCOF.
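The central idea, recovering second-order statistics from first-order ones, can be illustrated with a small simulation. The sketch below rests on a standard statistical identity: if client k holds n_k samples of a class, its uploaded class mean has covariance Σ/n_k, so the count-weighted scatter of client means around the global mean is an unbiased estimate of Σ. This is a minimal illustration of the principle, not necessarily the paper's exact FedCOF estimator; the function name, dimensions, and client counts are all hypothetical.

```python
import numpy as np

def estimate_class_cov(means, counts):
    """Unbiased class-covariance estimate from per-client class means and counts.

    Uses the identity E[sum_k n_k (mu_k - mu_bar)(mu_k - mu_bar)^T] = (K - 1) Sigma,
    where mu_k is the class mean uploaded by client k and n_k its sample count.
    """
    counts = np.asarray(counts, dtype=float)
    N = counts.sum()
    mu_bar = (counts[:, None] * means).sum(axis=0) / N   # weighted global class mean
    centered = means - mu_bar
    scatter = (counts[:, None] * centered).T @ centered  # sum_k n_k c_k c_k^T
    return scatter / (len(counts) - 1)

# Simulate clients uploading only first-order statistics: the mean of n_k
# i.i.d. samples from N(mu, Sigma) is itself Gaussian with covariance Sigma / n_k.
rng = np.random.default_rng(0)
d, K = 5, 500                                    # feature dim, number of clients (illustrative)
true_cov = np.diag(np.arange(1.0, d + 1))        # ground-truth class covariance
counts = rng.integers(20, 100, size=K)           # samples of this class per client
client_means = np.stack(
    [rng.multivariate_normal(np.zeros(d), true_cov / n) for n in counts]
)

cov_hat = estimate_class_cov(client_means, counts)
rel_err = np.linalg.norm(cov_hat - true_cov) / np.linalg.norm(true_cov)
```

The server thus obtains a class covariance without any client ever transmitting one. Such an estimate could then seed an LDA-style linear head, e.g. weights of the form (Σ̂ + λI)⁻¹ μ_c per class; that particular initialization is one common choice, given here as an assumption rather than the paper's exact construction.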