FedHybrid: Breaking the Memory Wall of Federated Learning via Hybrid Tensor Management

📅 2025-10-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
Federated learning (FL) faces severe memory constraints when deployed on resource-limited mobile devices. To address this, we propose FedHybrid, the first framework integrating device selection, computational graph analysis, and resource-aware scheduling into a hybrid tensor management strategy. It achieves fine-grained, tensor-level memory optimization via activation compression, dynamic recomputation, and memory-budget-driven execution plan generation. Its core innovation lies in explicitly modeling heterogeneous client resource constraints as a computational graph reconstruction problem, enabling joint optimization of memory footprint, model accuracy, and training efficiency. Extensive experiments demonstrate that, under stringent memory budgets, FedHybrid improves model accuracy by up to 39.1% over state-of-the-art baselines while accelerating wall-clock training time by 15.5×. These results significantly advance the practical deployment of FL on memory-constrained edge devices.

📝 Abstract
Federated Learning (FL) emerges as a new learning paradigm that enables multiple devices to collaboratively train a shared model while preserving data privacy. However, one fundamental and prevailing challenge that hinders the deployment of FL on mobile devices is the memory limitation. This paper proposes FedHybrid, a novel framework that effectively reduces the memory footprint during the training process while guaranteeing the model accuracy and the overall training progress. Specifically, FedHybrid first selects the participating devices for each training round by jointly evaluating their memory budget, computing capability, and data diversity. After that, it judiciously analyzes the computational graph and generates an execution plan for each selected client in order to meet the corresponding memory budget while minimizing the training delay through employing a hybrid of recomputation and compression techniques according to the characteristic of each tensor. During the local training process, FedHybrid carries out the execution plan with a well-designed activation compression technique to effectively achieve memory reduction with minimum accuracy loss. We conduct extensive experiments to evaluate FedHybrid on both simulation and off-the-shelf mobile devices. The experiment results demonstrate that FedHybrid achieves up to a 39.1% increase in model accuracy and a 15.5× reduction in wall clock time under various memory budgets compared with the baselines.
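The abstract describes a per-tensor execution plan that trades training delay for memory by choosing, for each activation, whether to keep it, recompute it in the backward pass, or compress it. The paper does not publish its planner, but the idea can be sketched as a greedy budget-fitting pass: rank candidate actions by delay cost per megabyte saved and apply the cheapest ones until the plan fits the device's budget. All names and fields below (`Tensor`, `plan`, the cost model) are illustrative assumptions, not the authors' actual algorithm.

```python
from dataclasses import dataclass

@dataclass
class Tensor:
    name: str
    size_mb: float        # activation memory the tensor occupies
    recompute_ms: float   # delay to recompute it during the backward pass
    compress_ms: float    # delay to compress and later decompress it
    compress_ratio: float # fraction of size remaining after compression

def plan(tensors, budget_mb):
    """Greedy sketch: start by keeping every activation, then evict the
    cheapest option (delay per MB saved) via recomputation or compression
    until resident memory fits the budget."""
    resident = sum(t.size_mb for t in tensors)
    actions = {t.name: "keep" for t in tensors}
    # Candidate actions: (delay cost per MB saved, tensor, action, MB saved).
    options = []
    for t in tensors:
        options.append((t.recompute_ms / t.size_mb, t, "recompute", t.size_mb))
        saved = t.size_mb * (1 - t.compress_ratio)
        if saved > 0:
            options.append((t.compress_ms / saved, t, "compress", saved))
    options.sort(key=lambda o: o[0])
    for _, t, action, saved in options:
        if resident <= budget_mb:
            break                      # budget met, stop evicting
        if actions[t.name] != "keep":
            continue                   # at most one action per tensor
        actions[t.name] = action
        resident -= saved
    return actions, resident
```

Under this toy cost model, a tensor that is cheap to compress but expensive to recompute (e.g. an early-layer activation feeding a long recompute chain) would be compressed, while one with a fast recompute path would be recomputed, which matches the hybrid strategy the abstract motivates.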
Problem

Research questions and friction points this paper is trying to address.

Reducing memory footprint in federated learning training
Managing memory constraints on mobile devices effectively
Optimizing tensor handling through hybrid recomputation and compression
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid tensor management for memory reduction
Joint device selection based on multiple capabilities
Execution plan with recomputation and compression techniques
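The joint device-selection step can be illustrated with a small scoring sketch: filter out clients whose memory budget cannot hold the model, then rank the remainder by a weighted combination of compute capability and data diversity. The field names, weights, and the linear score are assumptions for illustration; the paper's actual selection criterion is not given here.

```python
def select_clients(clients, model_mem_mb, k, w_compute=0.5, w_diversity=0.5):
    """Sketch: jointly evaluate memory budget, computing capability, and
    data diversity. Clients that cannot fit the model are excluded; the
    rest are ranked by a normalized weighted score and the top-k chosen."""
    eligible = [c for c in clients if c["mem_mb"] >= model_mem_mb]
    if not eligible:
        return []
    max_flops = max(c["flops"] for c in eligible)
    max_div = max(c["diversity"] for c in eligible) or 1.0  # avoid div-by-zero

    def score(c):
        return (w_compute * c["flops"] / max_flops
                + w_diversity * c["diversity"] / max_div)

    return sorted(eligible, key=score, reverse=True)[:k]
```

A real system would refresh these signals every round (memory budgets and compute availability change as other apps run), but the filter-then-rank structure conveys why selection must precede per-client execution planning.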
Kahou Tam
State Key Laboratory of IoTSC, University of Macau, Macau SAR, China
Chunlin Tian
University of Macau
MLSys
Li Li
State Key Laboratory of IoTSC, University of Macau, Macau SAR, China
Haikai Zhao
Simon Fraser University, Canada
ChengZhong Xu
State Key Laboratory of IoTSC, University of Macau, Macau SAR, China