🤖 AI Summary
To address clients' systematic forgetting of the global decision boundary during local training in Non-IID federated learning, this paper proposes FedProj. Methodologically, it introduces (1) a server-side ensemble knowledge transfer loss that calibrates the learned global decision boundary, and (2) an episodic memory mechanism that stores average ensemble logits on public unlabeled data and replays them to regularize gradient updates during local training. Empirically, the paper characterizes, through an experimental analysis on a toy example, how local updates degrade the global decision boundary regardless of initialization, even from pre-trained optimal weights. On multiple Non-IID benchmarks, FedProj consistently outperforms state-of-the-art methods, improving global generalization accuracy by 3.2-7.8% while effectively mitigating model forgetting. Its design bridges empirical insight with practical deployability.
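The server-side ensemble knowledge transfer described in (1) can be pictured as distilling the clients' averaged predictions on a shared public batch into the global model. The sketch below is an illustrative assumption, not the paper's actual formulation: the function names, the plain logit averaging, and the KL form of the loss are all hypothetical.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_logits(client_logits):
    """Average the clients' logits on the same public unlabeled batch
    (illustrative fusion rule; the paper's exact rule may differ)."""
    return np.mean(np.stack(client_logits, axis=0), axis=0)

def knowledge_transfer_loss(global_logits, avg_logits, eps=1e-12):
    """KL(ensemble || global) on the public batch: small when the global
    model's predictions match the ensemble's, positive otherwise."""
    p = softmax(avg_logits)      # ensemble "teacher" distribution
    q = softmax(global_logits)   # global "student" distribution
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)))
```

Minimizing this loss on the server pulls the global model toward the ensemble consensus, which is one plausible way to calibrate the global decision boundary after aggregation.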
📝 Abstract
The inevitable presence of data heterogeneity makes federated learning very challenging. Numerous methods address this issue, such as local regularization, better model fusion techniques, and data sharing. Though effective, they lack a deep understanding of how data heterogeneity affects the global decision boundary. In this paper, we bridge this gap by performing an experimental analysis of the learned decision boundary using a toy example. Our observations are surprising: (1) existing methods suffer from forgetting, with clients forgetting the global decision boundary and learning only a perfect local one; and (2) this happens regardless of the initial weights, with clients forgetting the global decision boundary even when starting from pre-trained optimal weights. We present FedProj, a federated learning framework that robustly learns the global decision boundary and avoids forgetting it during local training. To achieve better ensemble knowledge fusion, we design a novel server-side ensemble knowledge transfer loss that further calibrates the learned global decision boundary. To alleviate forgetting of the learned global decision boundary, we further propose leveraging an episodic memory of average ensemble logits on a public unlabeled dataset to regulate the gradient updates at each step of local training. Experimental results demonstrate that FedProj outperforms state-of-the-art methods by a large margin.
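The episodic-memory regularizer described above can be sketched as a local objective that combines cross-entropy on the client's private data with a penalty pulling the client's predictions on the public batch toward the stored ensemble logits. This is a minimal sketch under stated assumptions: the function names, the KL form of the penalty, and the weighting hyperparameter `lam` are illustrative, not the paper's actual loss.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def local_regularized_loss(local_logits_private, labels_onehot,
                           local_logits_public, memory_ensemble_logits,
                           lam=1.0, eps=1e-12):
    """Hypothetical local objective: cross-entropy on private data plus a
    KL term that keeps the client's predictions on public unlabeled data
    close to the episodic memory of average ensemble logits."""
    # Supervised cross-entropy on the client's private batch.
    p_local = softmax(local_logits_private)
    ce = float(np.mean(-np.sum(labels_onehot * np.log(p_local + eps), axis=-1)))
    # KL(memory || local) on the public batch: zero when the client still
    # matches the stored ensemble, growing as it drifts (i.e., forgets).
    p_mem = softmax(memory_ensemble_logits)
    q_pub = softmax(local_logits_public)
    kl = float(np.mean(np.sum(p_mem * (np.log(p_mem + eps) - np.log(q_pub + eps)), axis=-1)))
    return ce + lam * kl
```

Evaluating this loss at each local step penalizes gradient updates that move the client away from the global consensus, which is the mechanism by which the memory is meant to prevent boundary forgetting.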