🤖 AI Summary
Data-driven rate control for video conferencing has seen limited deployment because online training degrades performance in production. This paper proposes Tarzan, a purely offline, log-driven passive learning approach that infers effective bitrate policies directly from production telemetry logs, without online adaptation. To generalize despite the lack of feedback for new decisions, it integrates conservative counterfactual reasoning, uncertainty-aware modeling, temporal behavior cloning, and robust reinforcement learning. Across diverse emulated and real-world networks, it achieves 15–39% higher average video bitrate and reduces freeze rates by 60–100% compared to GCC.
📝 Abstract
Rate control algorithms are at the heart of video conferencing platforms, determining target bitrates that match dynamic network characteristics to deliver high quality. Recent data-driven strategies have shown promise for this challenging task, but the performance degradation they introduce during training has been a nonstarter for many production services, precluding adoption. This paper aims to bolster the practicality of data-driven rate control by presenting an alternative avenue for experiential learning: learning purely from existing telemetry logs produced by the incumbent algorithm in production. We observe that these logs contain effective decisions, although often at the wrong times or in the wrong order. To realize this approach despite the inherent uncertainty that log-based learning brings (i.e., lack of feedback for new decisions), our system, Tarzan, combines a variety of robust learning techniques (i.e., conservatively reasoning about alternate behavior to minimize risk and using a richer model formulation to account for environmental noise). Across diverse networks (emulated and real-world), Tarzan outperforms the widely deployed GCC algorithm, increasing average video bitrates by 15–39% while reducing freeze rates by 60–100%.
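The idea of conservatively reasoning about alternate behavior from logs alone can be illustrated with a small sketch. This is not Tarzan's algorithm, which the abstract does not specify; it is a generic pessimistic value estimate, assuming logs reduced to hypothetical `(state, bitrate, reward)` tuples with discrete state labels. The function name, the `penalty` parameter, and the reward values are all illustrative assumptions.

```python
from collections import defaultdict
from math import sqrt


def conservative_bitrate_choice(logs, state, candidates, penalty=1.0):
    """Pick the candidate bitrate with the best pessimistic reward estimate.

    logs: iterable of (state, bitrate, reward) tuples recorded while the
    incumbent algorithm was running (hypothetical log schema).
    Returns None when no candidate was ever observed in this state.
    """
    # (state, bitrate) -> [observation count, reward sum]
    stats = defaultdict(lambda: [0, 0.0])
    for s, b, r in logs:
        entry = stats[(s, b)]
        entry[0] += 1
        entry[1] += r

    best, best_score = None, float("-inf")
    for b in candidates:
        count, total = stats.get((state, b), (0, 0.0))
        if count == 0:
            # The logs hold no feedback for this decision, so skip it.
            continue
        # Mean logged reward minus a penalty that shrinks with more evidence.
        score = total / count - penalty / sqrt(count)
        if score > best_score:
            best, best_score = b, score
    return best


if __name__ == "__main__":
    # Hypothetical logs: 1000 kbps is well supported, 2000 kbps was seen once.
    logs = [("good", 1000, 1.0)] * 10 + [("good", 2000, 1.5)]
    print(conservative_bitrate_choice(logs, "good", [1000, 2000, 3000]))
```

With the default penalty, the well-supported bitrate wins even though the rarely seen one has a higher logged reward; setting `penalty=0` removes the pessimism and flips the choice.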