An Experimental Study of Split-Learning TinyML on Ultra-Low-Power Edge/IoT Nodes

📅 2025-07-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Deploying deep learning models on ultra-low-power edge devices—such as microcontrollers—is hindered by severe memory and computational constraints. Method: This paper introduces the first wireless performance evaluation framework for Split Learning tailored to TinyML, implemented on an end-to-end testbed based on the ESP32-S3. It enables over-the-air benchmarking across multiple protocols—ESP-NOW, BLE, UDP/IP, and TCP/IP—for intermediate activation transmission. Using 8-bit quantized MobileNetV2, it measures latency and energy consumption for transmitting 5.66 kB activation tensors. Results: UDP achieves 3.2 ms one-way latency and a steady-state RTT of 5.8 s; ESP-NOW reduces RTT to 3.7 s; BLE significantly extends device battery lifetime. The study systematically characterizes the critical trade-off between latency and energy across wireless protocols, providing empirical evidence and design guidelines for deploying Split Learning in real-world IoT environments.
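As a rough illustration of the payload sizes involved, the sketch below applies symmetric per-tensor 8-bit quantization to a hypothetical intermediate activation tensor. The tensor shape and the quantizer are assumptions for illustration, not the paper's exact TFLite quantization scheme:

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor 8-bit quantization (illustrative only)."""
    scale = np.abs(x).max() / 127.0          # map max magnitude to 127
    q = np.round(x / scale).astype(np.int8)  # values land in [-127, 127]
    return q, scale

# Hypothetical activation at the split point; 1 x 7 x 7 x 118 int8 values
# come to 5782 bytes, close to the ~5.66 kB payload reported in the paper.
act = np.random.randn(1, 7, 7, 118).astype(np.float32)
q, scale = quantize_int8(act)
print(f"payload: {q.nbytes} bytes ({q.nbytes / 1024:.2f} kB)")
```

Sending the int8 tensor plus one float scale, instead of float32 activations, cuts the over-the-air payload roughly fourfold, which is what makes per-inference wireless offloading viable on these links.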

📝 Abstract
Running deep learning inference directly on ultra-low-power edge/IoT nodes has been limited by the tight memory and compute budgets of microcontrollers. Split learning (SL) addresses this limitation by executing part of the inference on the sensor and off-loading the remainder to a companion device. However, the performance of split learning on constrained devices, and the impact of low-power over-the-air transport protocols on it, remain largely unexplored. To the best of our knowledge, this paper presents the first end-to-end TinyML + SL testbed built on Espressif ESP32-S3 boards, designed to benchmark the over-the-air performance of split-learning TinyML in edge/IoT environments. We benchmark a MobileNetV2 image recognition model, which is quantized to 8-bit integers, partitioned, and delivered to the nodes via over-the-air updates. The intermediate activations are exchanged through different wireless communication methods: ESP-NOW, BLE, and traditional UDP/IP and TCP/IP, enabling a head-to-head comparison on identical hardware. Measurements show that splitting the model after the block_16_project_BN layer generates a 5.66 kB tensor that traverses the link in 3.2 ms over UDP, achieving a steady-state round-trip latency of 5.8 s. ESP-NOW delivers the most favorable round-trip time at 3.7 s; BLE extends battery life further but increases latency beyond 10 s.
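The one-way transfer of the split-point activation can be mimicked in spirit on a workstation. The loopback sketch below is a stand-in for the ESP32-S3 wireless link (the 5797-byte payload approximating the reported ~5.66 kB is an assumption), timing a single UDP datagram carrying an int8 activation tensor:

```python
import socket
import time
import numpy as np

PAYLOAD_BYTES = 5797  # hypothetical size of the ~5.66 kB activation tensor

# Loopback receiver standing in for the companion device.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))           # OS picks a free port
port = server.getsockname()[1]

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
activation = np.random.randint(-128, 128, PAYLOAD_BYTES, dtype=np.int8)

t0 = time.perf_counter()
client.sendto(activation.tobytes(), ("127.0.0.1", port))  # one datagram
data, _ = server.recvfrom(8192)                           # fits in one read
one_way = time.perf_counter() - t0
print(f"received {len(data)} bytes in {one_way * 1e3:.3f} ms")
```

On loopback the payload fits comfortably in a single datagram; over a real 802.11 or ESP-NOW link the same tensor would typically be fragmented and retried at the MAC layer, which is where the millisecond-scale one-way figures reported above come from.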
Problem

Research questions and friction points this paper is trying to address.

Evaluates split-learning TinyML on ultra-low-power edge/IoT nodes
Benchmarks wireless protocols for split-learning performance
Assesses memory-compute trade-offs in microcontroller-based deep learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Split-learning TinyML on ESP32-S3 boards
Quantized 8-bit MobileNetV2 model partitioning
Wireless activation exchange via ESP-NOW/BLE/UDP/TCP