🤖 AI Summary
To address the performance degradation of global models in federated learning (FL) caused by statistical heterogeneity, this paper proposes a privacy-preserving personalized FL framework. Its core innovation is an early-stage quality control mechanism: clients extract representative features from their local data, and unsupervised clustering over these features dynamically groups participants by data distribution similarity, enabling tailored training strategies per group. Crucially, no raw data or labels are shared, preserving privacy while supporting personalization. Experiments on CIFAR-10 and MNIST use FedAvg, SCAFFOLD, and IFCA as base optimizers. Results show that the framework significantly improves target client model accuracy, especially under low client participation rates, outperforming IFCA and other baselines in both scalability and effectiveness.
📝 Abstract
Federated learning enables collaborative model training without sharing raw data, but data heterogeneity consistently challenges the performance of the global model. Traditional optimization methods often rely on collaborative global model training involving all clients, followed by local adaptation to improve individual performance. In this work, we focus on early-stage quality control and propose PQFed, a novel privacy-preserving personalized federated learning framework that designs customized training strategies for each client prior to the federated training process. PQFed extracts representative features from each client's raw data and applies clustering techniques to estimate inter-client dataset similarity. Based on these similarity estimates, the framework implements a client selection strategy that enables each client to collaborate with others who have compatible data distributions. We evaluate PQFed on two benchmark datasets, CIFAR-10 and MNIST, integrated with three existing federated learning algorithms. Experimental results show that PQFed consistently improves the target client's model performance, even with a limited number of participants. We further benchmark PQFed against a baseline cluster-based algorithm, IFCA, and observe that PQFed also achieves better performance in low-participation scenarios. These findings highlight PQFed's scalability and effectiveness in personalized federated learning settings.
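The grouping step described above, in which each client summarizes its local data with extracted features and clients are then clustered by distribution similarity, can be sketched as follows. This is an illustrative reconstruction, not the paper's actual implementation: the mean-feature "signature", the plain k-means with farthest-point initialization, and all function names are assumptions made for the sketch.

```python
import numpy as np

def client_signature(features: np.ndarray) -> np.ndarray:
    # Summarize a client's local data as the mean of its extracted
    # feature vectors (one possible privacy-friendly summary; the
    # paper's exact representation may differ).
    return features.mean(axis=0)

def cluster_clients(signatures: np.ndarray, k: int, iters: int = 50) -> np.ndarray:
    # Plain k-means over per-client signatures.
    # Farthest-point initialization keeps the seeds well separated.
    centers = [signatures[0]]
    for _ in range(k - 1):
        dists = np.min(
            [np.linalg.norm(signatures - c, axis=1) for c in centers], axis=0
        )
        centers.append(signatures[dists.argmax()])
    centers = np.array(centers)
    labels = np.zeros(len(signatures), dtype=int)
    for _ in range(iters):
        # Assign each client to its nearest cluster center.
        d = np.linalg.norm(signatures[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        # Recompute centers from the assigned clients.
        for j in range(k):
            if (labels == j).any():
                centers[j] = signatures[labels == j].mean(axis=0)
    return labels

# Toy example: ten clients whose feature distributions form two groups.
rng = np.random.default_rng(1)
sigs = np.vstack([
    rng.normal(0.0, 0.1, (5, 8)),  # clients 0-4: one data distribution
    rng.normal(3.0, 0.1, (5, 8)),  # clients 5-9: a shifted distribution
])
labels = cluster_clients(sigs, k=2)
# Each client would then collaborate only with peers in its own cluster.
```

In this sketch, each cluster becomes a collaboration group: federated training (e.g. FedAvg) would run within a group rather than across all clients, which is the client selection idea the abstract attributes to PQFed.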