Tetris: Efficient Intra-Datacenter Calls Packing for Large Conferencing Services

📅 2025-08-01
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address CPU load imbalance among media processor (MP) servers in large-scale conferencing services (e.g., Zoom, Teams)—causing thermal hotspots and increased operational costs—this paper proposes Tetris, a two-stage call packing framework integrating historical load prediction with linear optimization. In the first stage, initial MP assignment is optimized via trajectory-based learning from historical load patterns. In the second stage, periodic linear programming drives dynamic call migration to adaptively accommodate heterogeneous call sizes, media types (e.g., audio-only vs. video-enabled), and bursty arrival patterns. Evaluated on over ten million real-world call traces, Tetris reduces the number of participants hosted on hotspot servers by at least 2.5×, significantly mitigating load skew, improving resource utilization, and enhancing system scalability.
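The second stage described above can be illustrated with a small relaxed linear program: place calls fractionally across MPs to minimize the maximum per-MP load, then round each call to its highest-weight MP. This is a hedged sketch, not the paper's exact formulation — the variable layout, the min-max objective, and the argmax rounding are all assumptions for illustration.

```python
# Sketch of LP-based call re-placement (a stand-in for Tetris's periodic
# migration step; the paper's actual LP formulation may differ).
import numpy as np
from scipy.optimize import linprog

def migrate_calls(call_loads, n_mps):
    """Relaxed LP: choose fractional placements x[c, m] in [0, 1] so the
    maximum per-MP load z is minimized, then round each call to its
    argmax MP. Returns one MP index per call."""
    n_calls = len(call_loads)
    n_x = n_calls * n_mps            # one variable per (call, MP) pair
    c = np.zeros(n_x + 1)
    c[-1] = 1.0                      # minimize z, the last variable
    # Each call must be fully placed: sum_m x[c, m] = 1
    A_eq = np.zeros((n_calls, n_x + 1))
    for ci in range(n_calls):
        A_eq[ci, ci * n_mps:(ci + 1) * n_mps] = 1.0
    b_eq = np.ones(n_calls)
    # Per-MP load stays under z: sum_c load[c] * x[c, m] - z <= 0
    A_ub = np.zeros((n_mps, n_x + 1))
    for m in range(n_mps):
        for ci in range(n_calls):
            A_ub[m, ci * n_mps + m] = call_loads[ci]
        A_ub[m, -1] = -1.0
    b_ub = np.zeros(n_mps)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, 1)] * n_x + [(0, None)])
    x = res.x[:-1].reshape(n_calls, n_mps)
    return x.argmax(axis=1)          # round fractional placement to one MP
```

In practice the real system would solve this only over calls currently on hot MPs and add migration-cost terms; the sketch keeps just the load-balancing core.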


📝 Abstract
Conference services like Zoom, Microsoft Teams, and Google Meet facilitate millions of daily calls, yet ensuring high performance at low costs remains a significant challenge. This paper revisits the problem of packing calls across Media Processor (MP) servers that host the calls within individual datacenters (DCs). We show that the algorithm used in Teams -- a large-scale conferencing service -- as well as other state-of-the-art algorithms are prone to placing calls in ways that cause some of the MPs to become hot (high CPU utilization), which leads to degraded performance and/or elevated hosting costs. The problem arises from disregarding the variability in CPU usage among calls, influenced by differences in participant numbers and media types (audio/video), compounded by bursty call arrivals. To tackle this, we propose Tetris, a multi-step framework which (a) optimizes initial call assignments by leveraging historical data and (b) periodically migrates calls from hot MPs using linear optimization, aiming to minimize hot MP usage. Evaluation based on a 24-hour trace of over 10 million calls in one DC shows that Tetris reduces participant numbers on hot MPs by at least 2.5X.
Problem

Research questions and friction points this paper is trying to address.

Optimizing call packing across Media Processor servers to prevent hotspots.
Addressing CPU variability due to participant numbers and media types.
Reducing hosting costs and performance degradation in large conferencing services.
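The CPU variability the paper highlights can be pictured with a toy per-call cost model: cost grows with participant count and is higher for video than audio-only. The coefficients below are made-up illustration values, not measurements from the paper.

```python
# Illustrative per-call CPU cost model (all coefficients are assumptions):
# cost scales with participant count, and video calls cost more than
# audio-only calls per participant.
def call_cpu_cost(participants, video=True, base=0.5,
                  per_participant=0.2, video_multiplier=3.0):
    cost = base + per_participant * participants
    return cost * (video_multiplier if video else 1.0)
```

A packing algorithm that ignores this spread treats a 2-person audio call and a 50-person video call as equal units, which is how hotspots form.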
Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimizes call assignments using historical data
Migrates calls from hot MPs via linear optimization
Reduces participant numbers on hot MPs by at least 2.5X
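The first innovation — history-aware initial assignment — can be sketched as a scoring rule: send an arriving call to the MP whose current load plus predicted near-future growth (the stand-in here for the paper's trajectory-based learning) plus the call's estimated cost is lowest. The function and parameter names are hypothetical.

```python
# Hedged sketch of history-aware initial placement. `predicted_growth` is a
# stand-in for per-MP load forecasts learned from historical trajectories;
# the paper's actual predictor is not reproduced here.
def assign_call(call_cost, current_loads, predicted_growth):
    """Return the index of the MP minimizing (load + forecast + call cost)."""
    scores = [load + growth + call_cost
              for load, growth in zip(current_loads, predicted_growth)]
    return min(range(len(scores)), key=scores.__getitem__)
```

For example, an MP that is lightly loaded now but forecast to spike would be passed over in favor of a steadier one — the forecast term is what distinguishes this from plain least-loaded placement.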