🤖 AI Summary
To address training-timeout issues in O-RAN federated learning across non-RT and near-RT RICs, caused by ever-growing model sizes, this paper proposes SplitMe, a novel split federated learning (SFL) framework. Methodologically, SplitMe replaces the frequent activation/gradient exchanges of conventional split FL with a mutual learning mechanism, letting the near-RT and non-RT RICs alternately and independently train their respective submodels. It then reconstructs the final global model via zeroth-order estimation of the non-RT-RIC's inverse model, drastically reducing O-RAN communication overhead. Finally, it jointly optimizes computation offloading, communication resource allocation, and the number of local update steps to meet strict deadline constraints while preserving convergence. Experimental results show that SplitMe outperforms state-of-the-art baselines, including SFL, FedAvg, and O-RANFed, in both convergence speed and resource efficiency, with reported gains of up to 3.2× faster convergence and 68% lower communication cost under realistic O-RAN deployment constraints.
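The zeroth-order estimation mentioned above can be illustrated with a standard two-point gradient estimator: the gradient of a black-box function is approximated from finite differences along random directions, with no backpropagation through the model. This is a generic sketch of the technique, not SplitMe's actual reconstruction procedure; the function, step size `mu`, and direction count are illustrative.

```python
import numpy as np

def zeroth_order_grad(f, x, mu=1e-4, num_dirs=100, rng=None):
    """Two-point zeroth-order estimate of grad f at x.

    Averages directional finite differences over random Gaussian
    directions u:  g ≈ mean[ (f(x + mu*u) - f(x)) / mu * u ].
    """
    rng = np.random.default_rng(rng)
    fx = f(x)
    g = np.zeros_like(x)
    for _ in range(num_dirs):
        u = rng.standard_normal(x.shape)
        g += (f(x + mu * u) - fx) / mu * u
    return g / num_dirs

# Sanity check on f(x) = ||x||^2, whose true gradient is 2x.
x = np.array([1.0, -2.0, 0.5])
g_est = zeroth_order_grad(lambda v: float(v @ v), x, num_dirs=5000, rng=0)
```

With enough random directions the estimate concentrates around the true gradient; the appeal in a split-learning setting is that only function evaluations, not per-layer gradients, need to cross the near-RT/non-RT boundary.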
📝 Abstract
The hierarchical architecture of the Open Radio Access Network (O-RAN) has enabled a new Federated Learning (FL) paradigm that trains models using data from non-real-time (non-RT) and near-real-time (near-RT) RAN Intelligent Controllers (RICs). However, ever-increasing model size leads to longer training time, jeopardizing the deadline requirements of both non-RT and near-RT RICs. To address this issue, split federated learning (SFL) offers an approach that offloads partial model layers from the near-RT-RIC to the high-performance non-RT-RIC. Nonetheless, its deployment presents two challenges: (i) frequent data/gradient transfers between the near-RT-RIC and non-RT-RIC in SFL incur significant communication cost in O-RAN; (ii) proper allocation of computational and communication resources in O-RAN is vital to satisfying the deadline and affects SFL convergence. Therefore, we propose SplitMe, an SFL framework that exploits mutual learning to alternately and independently train the near-RT-RIC's model and the non-RT-RIC's inverse model, eliminating frequent transfers. The "inverse" of the inverse model is derived via a zeroth-order technique to integrate the final model. We then solve a joint optimization problem for SplitMe to minimize overall resource costs with deadline-aware selection of near-RT-RICs and adaptive local updates. Our numerical results demonstrate that SplitMe remarkably outperforms FL frameworks such as SFL, FedAvg, and O-RANFed in terms of cost and convergence.
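The alternating training pattern described in the abstract can be sketched on a toy problem: two models (standing in for the near-RT-RIC submodel and the non-RT-RIC's inverse model) each take several local steps per round with the peer frozen, and exchange parameters only at round boundaries rather than swapping activations and gradients every step as in vanilla split learning. This is a minimal illustrative sketch, not the paper's algorithm; the linear models, consistency weight `lam`, and step counts are assumptions.

```python
import numpy as np

# Toy regression data shared by both sides for illustration only.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true

def local_steps(w, w_peer, steps=10, lr=0.05, lam=0.1):
    """Gradient descent on MSE plus a mutual-learning-style pull
    toward the peer's last exchanged parameters (peer stays frozen)."""
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y) + 2 * lam * (w - w_peer)
        w = w - lr * grad
    return w

w_a = np.zeros(3)  # stand-in for the near-RT-RIC submodel
w_b = np.zeros(3)  # stand-in for the non-RT-RIC inverse submodel
for _ in range(30):               # rounds: one parameter exchange each
    w_a = local_steps(w_a, w_b)   # train A independently, B frozen
    w_b = local_steps(w_b, w_a)   # then B independently, A frozen
```

The point of the sketch is the communication pattern: each round costs one parameter exchange instead of one activation/gradient exchange per local step, which is the overhead SplitMe's mutual learning is designed to eliminate.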