Uno: A One-Stop Solution for Inter- and Intra-Datacenter Congestion Control and Reliable Connectivity

📅 2025-10-17
🤖 AI Summary
Problem: In cloud- and AI-driven datacenters, intra-datacenter (intra-DC) and inter-datacenter (inter-DC) traffic coexist, but their round-trip times (RTTs) differ widely, which creates a congestion control mismatch. Intra-DC flows react to congestion faster and take a disproportionate share of bandwidth, harming rate fairness, while inter-DC flows recover from loss slowly, degrading reliability. Existing solutions rely on separate, isolated control mechanisms and do not jointly ensure fairness and robustness. Method: The paper proposes Uno, a unified congestion control and connection management architecture that integrates RTT-aware fast feedback, fair rate allocation, erasure-code-enhanced load balancing, and dynamic adaptive routing within a single protocol stack. Contribution/Results: The design jointly targets latency, throughput, fairness, and reliability. Experiments show that, compared to Gemini, flow completion times improve by 32% for inter-DC traffic and 24% for intra-DC traffic, along with gains in end-to-end communication efficiency and fairness.

📝 Abstract
Cloud computing and AI workloads are driving unprecedented demand for efficient communication within and across datacenters. However, the coexistence of intra- and inter-datacenter traffic within a datacenter, together with the disparity between intra- and inter-datacenter RTTs, complicates congestion management and traffic routing. In particular, the faster congestion response of intra-datacenter traffic causes rate unfairness when it competes with slower inter-datacenter flows. Additionally, inter-datacenter messages suffer from slow loss recovery and thus need added reliability. Existing solutions overlook these challenges and handle inter- and intra-datacenter congestion with separate control loops or at different granularities. We propose Uno, a unified system for both inter- and intra-DC environments that integrates a transport protocol for rapid congestion reaction and fair rate control with a load balancing scheme that combines erasure coding and adaptive routing. Our findings show that Uno significantly improves the completion times of both inter- and intra-DC flows compared to state-of-the-art methods such as Gemini.

Problem

Research questions and friction points this paper is trying to address.

Managing congestion control across datacenter networks with varying RTTs
Addressing rate unfairness between intra- and inter-datacenter traffic flows
Providing reliable connectivity and fast loss recovery for inter-datacenter messages
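
The rate unfairness described above can be illustrated with a toy model. The following is a minimal sketch, not Uno's algorithm or the paper's evaluation setup: it assumes two additive-increase/multiplicative-decrease (AIMD) flows that share one bottleneck, each adjusting its window once per RTT. The function name `simulate_aimd`, the RTT values, and the capacity are hypothetical choices made for illustration.

```python
def simulate_aimd(rtts, capacity, duration):
    """Toy AIMD model (an assumption, not Uno): return each flow's average rate.

    rtts     -- per-flow round-trip times, in whole time ticks
    capacity -- bottleneck capacity, in packets per tick
    duration -- number of ticks to simulate
    """
    cwnd = [1.0 for _ in rtts]    # congestion windows, in packets
    total = [0.0 for _ in rtts]   # accumulated per-tick rates

    for tick in range(1, duration + 1):
        # A flow with window w and round-trip time r sends w / r packets per tick.
        rates = [w / r for w, r in zip(cwnd, rtts)]
        congested = sum(rates) > capacity

        for i, rtt in enumerate(rtts):
            total[i] += rates[i]
            # Each flow reacts only once per its own RTT.
            if tick % rtt == 0:
                if congested:
                    cwnd[i] = max(1.0, cwnd[i] / 2)  # multiplicative decrease
                else:
                    cwnd[i] += 1.0                   # additive increase

    return [s / duration for s in total]
```

Under these assumptions, calling `simulate_aimd([1, 100], capacity=10.0, duration=60000)` gives the short-RTT flow a far larger average rate than the long-RTT flow, because it probes for bandwidth 100 times as often. The exact ratio is an artifact of this simplified, deterministic model.
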

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified system for inter- and intra-datacenter congestion control
Transport protocol that provides rapid congestion reaction and fair rate control
Load balancing scheme that combines erasure coding with adaptive routing
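
To show why erasure coding helps with slow loss recovery, the following is a minimal sketch of the simplest possible erasure code: a single XOR parity block. It is an illustrative assumption, not the code Uno uses, which the summary does not specify. The function names `xor_blocks`, `encode`, and `recover` are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Append one XOR parity block to k equally sized data blocks."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(received):
    """Return the data blocks, rebuilding at most one lost block (marked None)."""
    missing = [i for i, block in enumerate(received) if block is None]
    if len(missing) > 1:
        raise ValueError("a single-parity code can repair only one lost block")
    received = list(received)
    if missing:
        present = [block for block in received if block is not None]
        # The XOR of all surviving blocks equals the missing block.
        received[missing[0]] = xor_blocks(present)
    return received[:-1]  # drop the parity block
```

With this scheme a receiver that loses any one of the k + 1 blocks can rebuild the message locally instead of waiting a full inter-datacenter RTT for a retransmission, at the cost of one extra block of bandwidth. Practical systems typically use codes that tolerate more than one loss.
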

Authors

Tommaso Bonato
ETH Zürich, Zürich, Switzerland and Microsoft, Redmond, USA
Sepehr Abdous
Johns Hopkins University, Baltimore, USA and Microsoft, Redmond, USA
Abdul Kabbani
Principal Architect, Microsoft and Adjunct Associate Professor, University of California
Systems and Networking
Ahmad Ghalayini
Microsoft, Redmond, USA
Nadeen Gebara
Microsoft, Redmond, USA
Terry Lam
Microsoft, Redmond, USA
Anup Agarwal
Carnegie Mellon University, Pittsburgh, USA
Tiancheng Chen
ETH Zürich, Zürich, Switzerland
Zhuolong Yu
Microsoft
Konstantin Taranov
ETH Zurich
Networked systems
Mahmoud Elhaddad
Microsoft, Redmond, USA
Daniele De Sensi
Tenure-Track Assistant Professor, Sapienza University of Rome
High Performance Computing, Interconnection Networks, Power-Aware Computing
Soudeh Ghorbani
Johns Hopkins University, Baltimore, USA
Torsten Hoefler
Professor of Computer Science at ETH Zurich
High Performance Computing, Deep Learning, Networking, Message Passing Interface, Parallel and Distributed Computing