AI Summary
This work addresses the high communication overhead and memory consumption of conventional split learning, which relies on end-to-end backpropagation and therefore transmits both forward activations and backward gradients in every round. The authors propose a decoupled split learning approach that introduces an auxiliary classifier at the client-side split point to provide local loss signals, while the server trains on the received activations using ground-truth labels. This design enables semi-independent model updates on both sides without exchanging backward gradients, the first such mechanism in split learning to eliminate gradient communication entirely. Experimental results demonstrate that the method matches the accuracy of standard split learning on CIFAR-10 and CIFAR-100 while reducing communication volume by approximately 50% and lowering peak memory usage by up to 58%.
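As a rough illustration of where the ~50% figure comes from, here is a back-of-the-envelope sketch in Python; the tensor shape and dtype are hypothetical placeholders, not values from the paper. In standard split learning, the backward gradient at the split point has the same shape as the forward activation, so dropping it roughly halves per-round traffic (labels sent alongside activations are negligible by comparison).

```python
# Hypothetical accounting for per-round communication at the split point.
# The batch size, feature-map shape, and dtype below are illustrative only.
batch, channels, height, width = 128, 64, 16, 16
bytes_per_value = 4  # float32

activation_bytes = batch * channels * height * width * bytes_per_value

# Standard split learning: activations go up, same-shaped gradients come back.
standard_per_round = 2 * activation_bytes
# Decoupled variant: only activations are sent; no backward gradients.
decoupled_per_round = activation_bytes

print(f"standard:  {standard_per_round / 2**20:.1f} MiB/round")   # 16.0 MiB
print(f"decoupled: {decoupled_per_round / 2**20:.1f} MiB/round")  # 8.0 MiB
print(f"saving:    {1 - decoupled_per_round / standard_per_round:.0%}")  # 50%
```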
Abstract
Split learning is a distributed training paradigm in which a neural network is partitioned between clients and a server, allowing data to remain on the client while only intermediate activations are shared. Traditional split learning relies on end-to-end backpropagation across the client-server split point, which incurs large communication overhead (forward activations and backward gradients must be exchanged every iteration) and significant memory use (both sides store activations and gradients for the global backward pass). In this paper, we develop a beyond-backpropagation training method for split learning in which the client and server train their model partitions semi-independently, using local loss signals instead of propagated gradients. In particular, the client's network is augmented with a small auxiliary classifier at the split point that provides a local error signal, while the server trains on the client's transmitted activations using the true loss function. This decoupling removes the need to send backward gradients, cutting communication costs roughly in half and reducing memory overhead, since each side stores only the activations needed for its own backward pass. We evaluate our approach on CIFAR-10 and CIFAR-100. Our experiments show two key results. First, the proposed approach matches the accuracy of standard split learning with backpropagation. Second, by eliminating gradient transmission it reduces communication by 50% and cuts peak memory usage by up to 58%.
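To make the decoupled update concrete, the following is a minimal PyTorch sketch of one training round under our reading of the abstract. All module names (client_net, aux_head, server_net), architectures, and hyperparameters are hypothetical placeholders rather than the paper's actual models, and the sketch assumes labels are shared with the server so it can compute the true loss.

```python
# Minimal sketch of one decoupled split-learning round (hypothetical models).
import torch
import torch.nn as nn

# Client-side partition plus a small auxiliary classifier at the split point.
client_net = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU())
aux_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 10))
# Server-side partition, trained on received activations with the true loss.
server_net = nn.Sequential(nn.Flatten(), nn.Linear(64 * 32 * 32, 10))

opt_client = torch.optim.SGD(
    list(client_net.parameters()) + list(aux_head.parameters()), lr=0.1)
opt_server = torch.optim.SGD(server_net.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

def train_round(x, y):
    # --- Client side: local error signal from the auxiliary classifier ---
    activations = client_net(x)
    local_loss = criterion(aux_head(activations), y)
    opt_client.zero_grad()
    local_loss.backward()        # backward pass stays entirely on the client
    opt_client.step()

    # --- Communication: only detached activations (plus labels) go up ---
    sent = activations.detach()  # no gradient will ever flow back through this

    # --- Server side: true loss on the received activations ---
    server_loss = criterion(server_net(sent), y)
    opt_server.zero_grad()
    server_loss.backward()       # gradients stop at `sent`; nothing is returned
    opt_server.step()
    return local_loss.item(), server_loss.item()

x = torch.randn(8, 3, 32, 32)           # stand-in for a CIFAR-style batch
y = torch.randint(0, 10, (8,))
print(train_round(x, y))
```

Note how `detach()` is what severs the two backward passes: the server's gradient computation terminates at the received tensor, so no gradient message crosses the split, and each side retains only the activations its own local backward pass requires.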