🤖 AI Summary
Federated learning (FL) faces severe memory constraints when deployed on resource-limited mobile devices. To address this, we propose FedHybrid, the first framework integrating device selection, computational graph analysis, and resource-aware scheduling into a hybrid tensor management strategy. It achieves fine-grained, tensor-level memory optimization via activation compression, dynamic recomputation, and memory-budget-driven execution plan generation. Its core innovation lies in explicitly modeling heterogeneous client resource constraints as a computational graph reconstruction problem, enabling joint optimization of memory footprint, model accuracy, and training efficiency. Extensive experiments demonstrate that, under stringent memory budgets, FedHybrid improves model accuracy by up to 39.1% over state-of-the-art baselines while reducing wall-clock training time by up to 15.5×. These results significantly advance the practical deployment of FL on memory-constrained edge devices.
📝 Abstract
Federated Learning (FL) emerges as a new learning paradigm that enables multiple devices to collaboratively train a shared model while preserving data privacy. However, one fundamental and prevailing challenge that hinders the deployment of FL on mobile devices is memory limitation. This paper proposes FedHybrid, a novel framework that effectively reduces the memory footprint during training while guaranteeing model accuracy and overall training progress. Specifically, FedHybrid first selects the participating devices for each training round by jointly evaluating their memory budgets, computing capabilities, and data diversity. It then judiciously analyzes the computational graph and generates an execution plan for each selected client to meet the corresponding memory budget while minimizing training delay, employing a hybrid of recomputation and compression techniques according to the characteristics of each tensor. During local training, FedHybrid carries out the execution plan with a well-designed activation compression technique that achieves memory reduction with minimal accuracy loss. We conduct extensive experiments to evaluate FedHybrid both in simulation and on off-the-shelf mobile devices. The results demonstrate that FedHybrid achieves up to a 39.1% increase in model accuracy and a 15.5× reduction in wall-clock time under various memory budgets compared with the baselines.
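To make the per-tensor planning idea concrete, the following is a minimal, purely illustrative sketch of how an execution planner might choose, for each activation tensor, whether to keep it, compress it, or discard and recompute it so that the total footprint fits a client's memory budget. All names, cost numbers, and the greedy policy here are assumptions for illustration, not FedHybrid's actual algorithm.

```python
from dataclasses import dataclass

@dataclass
class Tensor:
    name: str
    size_mb: float          # memory if kept uncompressed
    recompute_ms: float     # delay to recompute it in the backward pass
    compress_ratio: float   # fraction of size remaining after compression

def make_plan(tensors, budget_mb):
    """Greedy sketch: pick memory-saving actions, cheapest delay first,
    until the total footprint fits the budget."""
    plan = {t.name: "keep" for t in tensors}
    used = sum(t.size_mb for t in tensors)
    # Visit tensors in order of recomputation cost (cheapest first).
    for t in sorted(tensors, key=lambda t: t.recompute_ms):
        if used <= budget_mb:
            break
        # Try compression first: frees size * (1 - ratio).
        plan[t.name] = "compress"
        used -= t.size_mb * (1 - t.compress_ratio)
        if used > budget_mb:
            # Still over budget: drop the tensor entirely and recompute it.
            plan[t.name] = "recompute"
            used -= t.size_mb * t.compress_ratio
    return plan, used

# Hypothetical activations of a small model on a 60 MB client budget.
tensors = [
    Tensor("conv1_act", size_mb=40, recompute_ms=5, compress_ratio=0.25),
    Tensor("conv2_act", size_mb=80, recompute_ms=12, compress_ratio=0.25),
    Tensor("fc_act", size_mb=10, recompute_ms=2, compress_ratio=0.5),
]
decisions, footprint = make_plan(tensors, budget_mb=60)
print(decisions, round(footprint, 1))
```

A real planner would also weigh the accuracy impact of compression and the dependency structure of the computational graph; this sketch only shows the budget-driven, per-tensor nature of the decision.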