SmartFLow: A Communication-Efficient SDN Framework for Cross-Silo Federated Learning

📅 2025-08-30

📈 Citations: 0

✨ Influential: 0

career value

220K/year

🤖 AI Summary

In cross-institutional federated learning (FL), frequent exchange of model weights between clients and the server is highly susceptible to network congestion, resulting in high synchronization latency and degraded training efficiency. To address this, we propose an SDN-driven dynamic routing framework tailored for FL. Leveraging SDN’s global network visibility, our approach innovatively integrates FL’s inherent communication periodicity and asynchrony, and designs a lightweight, network-state-aware path scheduling algorithm that enables adaptive routing optimization during training. The framework significantly reduces parameter synchronization latency: under a 50-node topology, it achieves 47% and 41% lower synchronization time compared to shortest-path and capacity-aware routing, respectively. It incurs minimal computational overhead, exhibits strong scalability, and demonstrates practical deployability in real-world FL deployments.

Technology Category

Application Category

📝 Abstract

Cross-silo Federated Learning (FL) enables multiple institutions to collaboratively train machine learning models while preserving data privacy. In such settings, clients repeatedly exchange model weights with a central server, making the overall training time highly sensitive to network performance. However, conventional routing methods often fail to prevent congestion, leading to increased communication latency and prolonged training. Software-Defined Networking (SDN), which provides centralized and programmable control over network resources, offers a promising way to address this limitation. To this end, we propose SmartFLow, an SDN-based framework designed to enhance communication efficiency in cross-silo FL. SmartFLow dynamically adjusts routing paths in response to changing network conditions, thereby reducing congestion and improving synchronization efficiency. Experimental results show that SmartFLow decreases parameter synchronization time by up to 47% compared to shortest-path routing and 41% compared to capacity-aware routing. Furthermore, it achieves these gains with minimal computational overhead and scales effectively to networks of up to 50 clients, demonstrating its practicality for real-world FL deployments.

Problem

Research questions and friction points this paper is trying to address.

Reducing communication latency in cross-silo federated learning

Preventing network congestion during model weight exchanges

Improving synchronization efficiency with dynamic routing adjustments

Innovation

Methods, ideas, or system contributions that make the work stand out.

SDN-based framework for efficient FL communication

Dynamic routing adjustment to reduce network congestion

Minimal overhead scaling to 50 clients

🔎 Similar Papers

No similar papers found.