FedConv: A Learning-on-Model Paradigm for Heterogeneous Federated Clients

📅 2024-06-03
🏛️ ACM SIGMOBILE International Conference on Mobile Systems, Applications, and Services (MobiSys 2024)
📈 Citations: 7 (influential: 0)
🤖 AI Summary
To address the uneven computational burden caused by client-side resource heterogeneity in federated learning (FL), this paper proposes FedConv, which trains lightweight sub-models directly in compressed convolutional form and thus avoids decompression overhead. FedConv introduces a novel "learning-on-model" paradigm, the first approach to enable end-to-end training of compressed sub-models. It further designs a transposed-convolution-based dilation mechanism that unifies the aggregation of heterogeneous sub-models while preserving each client's personalized parameters. Combined with joint optimization of the compression and dilation operators on the server using a small public dataset, FedConv improves accuracy by 35.2% on average across six benchmark datasets while cutting computation cost by 33.1% and communication cost by 24.8%, significantly outperforming existing FL methods.

📝 Abstract
Federated Learning (FL) facilitates collaborative training of a shared global model without exposing clients' private data. In practical FL systems, clients (e.g., edge servers, smartphones, and wearables) typically have disparate system resources. Conventional FL, however, adopts a one-size-fits-all solution, where a homogeneous large global model is transmitted to and trained on each client, resulting in an overwhelming workload for less capable clients and starvation for other clients. To address this issue, we propose FedConv, a client-friendly FL framework, which minimizes the computation and memory burden on resource-constrained clients by providing heterogeneous customized sub-models. FedConv features a novel learning-on-model paradigm that learns the parameters of the heterogeneous sub-models via convolutional compression. Unlike traditional compression methods, the compressed models in FedConv can be directly trained on clients without decompression. To aggregate the heterogeneous sub-models, we propose transposed convolutional dilation to convert them back to large models with a unified size while retaining personalized information from clients. The compression and dilation processes, transparent to clients, are optimized on the server leveraging a small public dataset. Extensive experiments on six datasets demonstrate that FedConv outperforms state-of-the-art FL systems in terms of model accuracy (by more than 35% on average), computation and communication overhead (with 33% and 25% reduction, respectively).
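
The "learning-on-model" idea can be pictured as convolutions applied to weight tensors rather than to data. Below is a minimal PyTorch sketch of that idea under illustrative assumptions: a hypothetical `ConvCompressor` shrinks one global conv layer's weight, shape (64, 64, 3, 3), to a sub-model weight, shape (32, 32, 3, 3), with learned 1x1 convolutions over the channel axes. All module names and shapes are our own assumptions, not the paper's exact operators.

```python
import torch
import torch.nn as nn

class ConvCompressor(nn.Module):
    """Sketch of learning-on-model: convolve over a weight tensor itself.

    Hypothetical shapes: global weight (c_out, c_in, k, k) -> sub-model
    weight (c_out_sub, c_in_sub, k, k). The 1x1 convs mix channel axes of
    the weight tensor; they never see training data directly.
    """

    def __init__(self, c_in=64, c_out=64, c_in_sub=32, c_out_sub=32):
        super().__init__()
        self.shrink_in = nn.Conv2d(c_in, c_in_sub, kernel_size=1)    # compress C_in axis
        self.shrink_out = nn.Conv2d(c_out, c_out_sub, kernel_size=1)  # compress C_out axis

    def forward(self, w):                       # w: (C_out, C_in, k, k)
        w = self.shrink_in(w)                   # -> (C_out, C_in_sub, k, k); C_out acts as batch
        w = self.shrink_out(w.transpose(0, 1))  # -> (C_in_sub, C_out_sub, k, k)
        return w.transpose(0, 1)                # -> (C_out_sub, C_in_sub, k, k)

global_w = torch.randn(64, 64, 3, 3)            # one global-model layer's weight
sub_w = ConvCompressor()(global_w)              # sub-model weight, trainable as-is on a client
print(sub_w.shape)                              # torch.Size([32, 32, 3, 3])
```

Because the compressed tensor already has a valid conv-layer shape, a client can plug it into its sub-model and train it directly, which is what lets FedConv skip the decompression step of traditional compression pipelines.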
Problem

Research questions and friction points this paper is trying to address.

Addresses resource disparity in federated learning clients
Reduces computation and memory burden on constrained clients
Enhances model accuracy and reduces communication overhead
Innovation

Methods, ideas, or system contributions that make the work stand out.

Heterogeneous sub-models customized for resource-constrained clients
Convolutional compression so sub-models train directly without decompression
Transposed convolutional dilation for aggregating heterogeneous sub-models (see the sketch after this list)
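
To make the aggregation step concrete, here is a companion sketch under the same illustrative assumptions as the compression example: a hypothetical `ConvDilator` expands each client's trained sub-model weight back to the global shape via 1x1 transposed convolutions, after which the unified weights can be averaged FedAvg-style. The paper additionally optimizes these compression and dilation operators on the server with a small public dataset, which this sketch omits.

```python
import torch
import torch.nn as nn

class ConvDilator(nn.Module):
    """Sketch of transposed convolutional dilation.

    Hypothetical shapes: sub-model weight (c_out_sub, c_in_sub, k, k) ->
    global-size weight (c_out, c_in, k, k), so heterogeneous sub-models
    can be aggregated at a unified size.
    """

    def __init__(self, c_in_sub=32, c_out_sub=32, c_in=64, c_out=64):
        super().__init__()
        self.grow_in = nn.ConvTranspose2d(c_in_sub, c_in, kernel_size=1)    # expand C_in axis
        self.grow_out = nn.ConvTranspose2d(c_out_sub, c_out, kernel_size=1)  # expand C_out axis

    def forward(self, w):                     # w: (C_out_sub, C_in_sub, k, k)
        w = self.grow_in(w)                   # -> (C_out_sub, C_in, k, k); C_out_sub acts as batch
        w = self.grow_out(w.transpose(0, 1))  # -> (C_in, C_out, k, k)
        return w.transpose(0, 1)              # -> (C_out, C_in, k, k)

dilate = ConvDilator()
client_subs = [torch.randn(32, 32, 3, 3) for _ in range(4)]  # 4 clients' trained sub-weights
expanded = [dilate(w) for w in client_subs]                  # unify to the global shape
global_w = torch.stack(expanded).mean(dim=0)                 # FedAvg-style averaging
print(global_w.shape)                                        # torch.Size([64, 64, 3, 3])
```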
👥 Authors

Leming Shen
The Hong Kong Polytechnic University
Generative AI, Large Language Model, Edge Computing, Wireless Sensing

Qian Yang
The Hong Kong Polytechnic University, University of Cambridge

Kaiyan Cui
The Hong Kong Polytechnic University, Nanjing University of Posts and Telecommunications

Yuanqing Zheng
The Hong Kong Polytechnic University
Wireless Networking, Ubiquitous Computing, Internet of Things, Embedded AI

Xiaoyong Wei
Sichuan University, The Hong Kong Polytechnic University

Jianwei Liu
Zhejiang University

Jinsong Han
Zhejiang University