FedConv: A Learning-on-Model Paradigm for Heterogeneous Federated Clients

📅 2024-06-03
🏛️ ACM SIGMOBILE International Conference on Mobile Systems, Applications, and Services (MobiSys 2024)
📈 Citations: 7 (influential: 0)
🤖 AI Summary
To address the uneven computational burden caused by client-side resource heterogeneity in federated learning (FL), this paper proposes FedConv, which trains lightweight sub-models directly in compressed convolutional form and thus avoids decompression overhead. FedConv introduces a novel "learning-on-model" paradigm, the first approach to enable end-to-end training of compressed sub-models. It further designs a transposed-convolution-based dilation mechanism that unifies the aggregation of heterogeneous sub-models while preserving each client's personalized parameters. Combined with joint optimization of the compression and dilation operators on the server using a small public dataset, FedConv improves accuracy by 35.2% on average across six benchmark datasets while cutting computation cost by 33.1% and communication cost by 24.8%, significantly outperforming existing FL methods.

📝 Abstract
Federated Learning (FL) facilitates collaborative training of a shared global model without exposing clients' private data. In practical FL systems, clients (e.g., edge servers, smartphones, and wearables) typically have disparate system resources. Conventional FL, however, adopts a one-size-fits-all solution, where a homogeneous large global model is transmitted to and trained on each client, resulting in an overwhelming workload for less capable clients and starvation for other clients. To address this issue, we propose FedConv, a client-friendly FL framework, which minimizes the computation and memory burden on resource-constrained clients by providing heterogeneous customized sub-models. FedConv features a novel learning-on-model paradigm that learns the parameters of the heterogeneous sub-models via convolutional compression. Unlike traditional compression methods, the compressed models in FedConv can be directly trained on clients without decompression. To aggregate the heterogeneous sub-models, we propose transposed convolutional dilation to convert them back to large models with a unified size while retaining personalized information from clients. The compression and dilation processes, transparent to clients, are optimized on the server leveraging a small public dataset. Extensive experiments on six datasets demonstrate that FedConv outperforms state-of-the-art FL systems in terms of model accuracy (by more than 35% on average), computation and communication overhead (with 33% and 25% reduction, respectively).
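
The "learning-on-model" idea can be pictured as convolutions applied to weight tensors rather than to data. Below is a minimal PyTorch sketch of that idea under illustrative assumptions: a hypothetical `ConvCompressor` shrinks one global conv layer's weight, shape (64, 64, 3, 3), to a sub-model weight, shape (32, 32, 3, 3), with learned 1x1 convolutions over the channel axes. All module names and shapes are our own assumptions, not the paper's exact operators.

```python
import torch
import torch.nn as nn

class ConvCompressor(nn.Module):
    """Sketch of learning-on-model: convolve over a weight tensor itself.

    Hypothetical shapes: global weight (c_out, c_in, k, k) -> sub-model
    weight (c_out_sub, c_in_sub, k, k). The 1x1 convs mix channel axes of
    the weight tensor; they never see training data directly.
    """

    def __init__(self, c_in=64, c_out=64, c_in_sub=32, c_out_sub=32):
        super().__init__()
        self.shrink_in = nn.Conv2d(c_in, c_in_sub, kernel_size=1)    # compress C_in axis
        self.shrink_out = nn.Conv2d(c_out, c_out_sub, kernel_size=1)  # compress C_out axis

    def forward(self, w):                       # w: (C_out, C_in, k, k)
        w = self.shrink_in(w)                   # -> (C_out, C_in_sub, k, k); C_out acts as batch
        w = self.shrink_out(w.transpose(0, 1))  # -> (C_in_sub, C_out_sub, k, k)
        return w.transpose(0, 1)                # -> (C_out_sub, C_in_sub, k, k)

global_w = torch.randn(64, 64, 3, 3)            # one global-model layer's weight
sub_w = ConvCompressor()(global_w)              # sub-model weight, trainable as-is on a client
print(sub_w.shape)                              # torch.Size([32, 32, 3, 3])
```

Because the compressed tensor already has a valid conv-layer shape, a client can plug it into its sub-model and train it directly, which is what lets FedConv skip the decompression step of traditional compression pipelines.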
Problem

Research questions and friction points this paper is trying to address.

Addresses resource disparity in federated learning clients
Reduces computation and memory burden on constrained clients
Enhances model accuracy and reduces communication overhead
Innovation

Methods, ideas, or system contributions that make the work stand out.

Heterogeneous sub-models customized for resource-constrained clients
Convolutional compression so sub-models train directly without decompression
Transposed convolutional dilation for aggregating heterogeneous sub-models (see the sketch after this list)
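
To make the aggregation step concrete, here is a companion sketch under the same illustrative assumptions as the compression example: a hypothetical `ConvDilator` expands each client's trained sub-model weight back to the global shape via 1x1 transposed convolutions, after which the unified weights can be averaged FedAvg-style. The paper additionally optimizes these compression and dilation operators on the server with a small public dataset, which this sketch omits.

```python
import torch
import torch.nn as nn

class ConvDilator(nn.Module):
    """Sketch of transposed convolutional dilation.

    Hypothetical shapes: sub-model weight (c_out_sub, c_in_sub, k, k) ->
    global-size weight (c_out, c_in, k, k), so heterogeneous sub-models
    can be aggregated at a unified size.
    """

    def __init__(self, c_in_sub=32, c_out_sub=32, c_in=64, c_out=64):
        super().__init__()
        self.grow_in = nn.ConvTranspose2d(c_in_sub, c_in, kernel_size=1)    # expand C_in axis
        self.grow_out = nn.ConvTranspose2d(c_out_sub, c_out, kernel_size=1)  # expand C_out axis

    def forward(self, w):                     # w: (C_out_sub, C_in_sub, k, k)
        w = self.grow_in(w)                   # -> (C_out_sub, C_in, k, k); C_out_sub acts as batch
        w = self.grow_out(w.transpose(0, 1))  # -> (C_in, C_out, k, k)
        return w.transpose(0, 1)              # -> (C_out, C_in, k, k)

dilate = ConvDilator()
client_subs = [torch.randn(32, 32, 3, 3) for _ in range(4)]  # 4 clients' trained sub-weights
expanded = [dilate(w) for w in client_subs]                  # unify to the global shape
global_w = torch.stack(expanded).mean(dim=0)                 # FedAvg-style averaging
print(global_w.shape)                                        # torch.Size([64, 64, 3, 3])
```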
👥 Authors

Leming Shen
The Hong Kong Polytechnic University
Generative AI, Large Language Model, Edge Computing, Wireless Sensing

Qian Yang
The Hong Kong Polytechnic University, University of Cambridge

Kaiyan Cui
The Hong Kong Polytechnic University, Nanjing University of Posts and Telecommunications

Yuanqing Zheng
The Hong Kong Polytechnic University
Wireless Networking, Ubiquitous Computing, Internet of Things, Embedded AI

Xiaoyong Wei
Sichuan University, The Hong Kong Polytechnic University

Jianwei Liu
Zhejiang University

Jinsong Han
Zhejiang University