🤖 AI Summary
To address high training latency in wireless distributed learning caused by constrained communication and computation resources, this paper proposes a synergistic optimization framework that integrates split learning (SL) and federated learning (FL). We jointly optimize the selection of learning paradigms (SL vs. FL), batch size, and communication/computation resource allocation—explicitly capturing their coupled effects. A novel integer non-convex optimization model is formulated, and a two-stage algorithm is designed: first solving its continuous relaxation via block coordinate descent, then recovering feasible integer batch sizes through a customized rounding scheme. Experiments demonstrate that, to achieve a target model accuracy, the proposed method reduces overall learning latency by up to 42.3% compared to state-of-the-art baselines, while significantly improving convergence speed, system efficiency, and final model accuracy.
📝 Abstract
Federated learning (FL) and split learning (SL) are two effective distributed learning paradigms in wireless networks, enabling collaborative model training across mobile devices without sharing raw data. While FL supports low-latency parallel training, it may converge to a less accurate model. In contrast, SL achieves higher accuracy through sequential training but suffers from increased delay. To leverage the advantages of both, hybrid split and federated learning (HSFL) allows some devices to operate in FL mode and others in SL mode. This paper aims to accelerate HSFL by addressing three key questions: 1) How does learning mode selection affect overall learning performance? 2) How does it interact with batch size? 3) How can these hyperparameters be jointly optimized alongside communication and computation resources to reduce overall learning delay? We first analyze convergence, revealing the interplay between learning mode and batch size. Next, we formulate a delay minimization problem and propose a two-stage solution: a block coordinate descent method for a relaxed problem to obtain a locally optimal solution, followed by a rounding algorithm to recover integer batch sizes with near-optimal performance. Experimental results demonstrate that our approach significantly accelerates convergence to the target accuracy compared to existing methods.
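The relax-then-round structure described above can be sketched on a toy two-device problem. Everything below is an illustrative assumption rather than the paper's actual formulation: the linear latency model, the constants, the shared batch-size budget, and the closed-form balancing step standing in for one block-coordinate-descent update.

```python
# Hedged sketch of a two-stage "solve relaxation, then round" scheme on a
# hypothetical two-device latency model; NOT the paper's real objective.
import math

COMP = (0.8, 1.5)   # hypothetical per-sample compute times (s/sample)
COMM = (0.4, 0.2)   # hypothetical per-round communication times (s)
TOTAL = 64          # assumed batch-size budget shared by the two devices

def latency(b):
    """Toy per-round latency: the slowest device dominates the round."""
    return max(c * bi + m for bi, c, m in zip(b, COMP, COMM))

def relaxed_solution(total=TOTAL):
    """Stage 1 (continuous relaxation): balance the two devices' times.
    With two devices this single block update has a closed form
    (c0*b0 + m0 = c1*(total - b0) + m1); the real algorithm would
    cycle such updates over all variable blocks until convergence."""
    b0 = (COMP[1] * total + COMM[1] - COMM[0]) / (COMP[0] + COMP[1])
    return [b0, total - b0]

def round_batches(b_cont, total=TOTAL):
    """Stage 2 (rounding): test integer neighbours of the relaxed
    solution that keep the budget feasible, and keep the best one."""
    candidates = []
    for b0 in (math.floor(b_cont[0]), math.ceil(b_cont[0])):
        b1 = total - b0
        if b0 >= 1 and b1 >= 1:
            candidates.append((b0, b1))
    return min(candidates, key=latency)

b_star = relaxed_solution()    # continuous, budget-feasible batch sizes
b_int = round_batches(b_star)  # integer batch sizes near the relaxed optimum
```

On this toy instance the relaxed optimum is roughly (41.65, 22.35); rounding checks both integer neighbours and keeps (42, 22), whose latency is within a fraction of a second of the relaxed bound, which is the sense in which such rounding is "near-optimal" when the objective varies slowly per unit of batch size.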