🤖 AI Summary
Existing split learning frameworks assume homogeneous client computational capabilities and fixed model partitioning points, which makes them ill-suited for real-world heterogeneous IoT environments. To address this, we propose a dynamic split learning framework tailored to heterogeneous devices, built around a novel “heterogeneous partitioning + early-exit” co-design: each device autonomously selects its optimal model partitioning layer and early-exit point based on its local compute capacity. We further design two cross-layer collaborative training strategies, Sequential and Averaging, to enable hierarchical model partitioning, cross-layer gradient aggregation, and distributed optimization. Evaluated on CIFAR-10, CIFAR-100, and STL-10, our framework achieves accuracy comparable to centralized training, significantly improves the participation rate of low-capability devices, enhances system throughput, and demonstrates practical compatibility across hardware tiers, from microcontrollers (MCUs) to edge GPUs.
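As a rough illustration of the “heterogeneous partitioning + early-exit” idea, the sketch below (PyTorch) splits a ResNet-18 at a per-device cut layer and attaches a lightweight exit classifier at that point, so a weaker device keeps fewer layers locally. The helper names (`make_client_model`, `ExitHead`) and the specific cut-layer choices are our own placeholders, not details from the paper.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18


class ExitHead(nn.Module):
    """Lightweight early-exit classifier attached at a device's cut layer (illustrative)."""

    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_channels, num_classes)

    def forward(self, x):
        return self.fc(self.pool(x).flatten(1))


def make_client_model(cut_layer: int, num_classes: int = 10):
    """Keep the first `cut_layer` top-level blocks of ResNet-18 on the device."""
    backbone = resnet18(num_classes=num_classes)
    blocks = list(backbone.children())[:-2]  # drop the global avgpool and fc head
    client = nn.Sequential(*blocks[:cut_layer])
    # Probe the channel width at the cut point to size the early-exit head.
    with torch.no_grad():
        feat = client(torch.zeros(1, 3, 32, 32))
    return client, ExitHead(feat.shape[1], num_classes)


# A constrained (MCU-class) device cuts early; an edge GPU can keep more layers.
client_small, exit_small = make_client_model(cut_layer=5)
client_large, exit_large = make_client_model(cut_layer=7)
```

The remaining layers beyond each device's cut point would run on the server, which is the standard split-learning arrangement; the early-exit head lets the device also produce local predictions without the server-side layers.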
📝 Abstract
The continuous scaling of deep neural networks has fundamentally transformed machine learning, with larger models demonstrating improved performance across diverse tasks. This growth in model size has dramatically increased the computational resources required for training. Consequently, distributed approaches such as Federated Learning and Split Learning have become essential paradigms for scalable deployment. However, existing Split Learning approaches assume client homogeneity and uniform split points across all participants, which critically limits their applicability to real-world IoT systems where devices have heterogeneous computational resources. To address this limitation, this paper proposes Hetero-SplitEE, a novel method that enables heterogeneous IoT devices to collaboratively train a shared deep neural network in parallel. By integrating heterogeneous early exits into hierarchical training, our approach allows each client to select a distinct split point (cut layer) tailored to its computational capacity. In addition, we propose two cooperative training strategies, the Sequential strategy and the Averaging strategy, to facilitate collaboration among clients with different split points. The Sequential strategy trains clients one after another with a shared server model to reduce computational overhead, while the Averaging strategy trains clients in parallel with periodic cross-layer aggregation. Extensive experiments on the CIFAR-10, CIFAR-100, and STL-10 datasets using ResNet-18 demonstrate that our method maintains competitive accuracy while efficiently supporting diverse computational constraints, enabling practical deployment of collaborative deep learning in heterogeneous IoT ecosystems.
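To make the Averaging strategy's "periodic cross-layer aggregation" more concrete, here is a minimal sketch under the assumption that each client model is an `nn.Sequential` prefix of the same backbone cut at a different depth. The function name `average_shared_layers` and the exact aggregation rule (plain parameter-wise mean over the clients that own a layer) are illustrative assumptions, not the paper's specification.

```python
from collections import defaultdict

import torch
import torch.nn as nn


def average_shared_layers(clients: list[nn.Sequential]) -> None:
    """Average each parameter over the clients whose (prefix) model contains it.

    Clients with different cut layers share only a prefix of the backbone, so
    shallow layers are averaged across all clients while deeper layers are
    averaged only among the clients deep enough to hold them.
    """
    grouped = defaultdict(list)  # parameter name -> tensors gathered from clients
    for client in clients:
        for name, param in client.named_parameters():
            grouped[name].append(param.data)
    for name, tensors in grouped.items():
        mean = torch.stack(tensors).mean(dim=0)
        for client in clients:
            params = dict(client.named_parameters())
            if name in params:
                params[name].data.copy_(mean)


# Usage (after each local training round): synchronize the overlapping layers,
# e.g. of the client_small / client_large prefixes from the earlier sketch.
# average_shared_layers([client_small, client_large])
```

Under the Sequential strategy, by contrast, no such aggregation step is needed: clients take turns training against the shared server-side model, so only one client is active at a time.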