🤖 AI Summary
In interactive video streaming (IVS), periodic large keyframes induce a long-tailed end-to-end (E2E) latency distribution, which degrades real-time interactivity. Conventional optimizations, such as jitter buffering and adaptive bitrate control, struggle to jointly achieve low latency, high visual fidelity, and broad codec/protocol compatibility. To address this, the authors propose a *pseudo-dual-stream* architecture: dual streams are activated *only briefly*, when keyframes appear, decoupling the latency-sensitive playback stream from the latency-tolerant reference stream. They further design a non-keyframe-prioritized transmission mechanism and a dynamic dual-stream bitrate allocation algorithm, implemented atop WebRTC with the aim of keeping overhead low. Experiments with real-world network traces show a 17.5% reduction in mean E2E latency and a 33.3% decrease in the 97th-percentile latency, while maintaining video clarity under varying bandwidth. To our knowledge, this is the first work to introduce the *instantaneous dual-stream* paradigm to IVS, balancing real-time performance, backward compatibility, and perceptual fidelity.
📝 Abstract
End-to-end (E2E) delay is critical to the interactive video streaming (IVS) experience, but it remains unsatisfactory because of its long-tail distribution, which is caused by periodic large keyframes. Conventional optimization strategies, such as jitter buffering, bitrate adaptation, and customized encoding, sacrifice clarity, average delay, or compatibility. To address this issue, we propose PDStream, a novel pseudo-dual streaming algorithm that aims to minimize E2E delay while maintaining video clarity. The core idea is to use dual streaming to split the two functions that a keyframe serves: delay-sensitive playback and delay-tolerant reference. Specifically, the playback function is carried by a second, parallel stream, which comprises much smaller non-keyframes and is allocated more immediate bandwidth for real-time performance. The reference function is carried by the first stream, which preserves the keyframes and is allocated more subsequent bandwidth to smooth out bursty traffic. Additionally, the "pseudo" design minimizes computational and transmission overheads by activating the dual streams only briefly when keyframes appear, supported by corresponding dual-stream bitrate allocation and adaptation to ensure low delay and clarity. We implement PDStream on a WebRTC-based IVS testbed and evaluate it with real-world network traces. Results show that PDStream significantly outperforms prior algorithms, reducing the average E2E delay by 17.5% and its 97th percentile by 33.3%, while maintaining clarity under varying bandwidth.
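The intuition behind splitting playback from reference can be illustrated with a toy queueing model. The sketch below is not PDStream itself: the function names, the fixed per-interval link budget, the frame sizes, and the assumption that the substitute non-keyframe is as small as an ordinary non-keyframe are all hypothetical, and decoder-side reference switching and bitrate adaptation are not modeled. It only compares a single FIFO stream, where a large keyframe delays the frames behind it, against a schedule in which a small non-keyframe is sent immediately and the keyframe bytes are drained with spare capacity in later intervals.

```python
from typing import List, Optional, Tuple


def fifo_delays(frame_sizes: List[float], budget: float) -> List[float]:
    """Sending delay of each frame, in frame intervals, over a FIFO link.

    Frame i is produced at time i; the link drains `budget` bytes per interval.
    """
    delays = []
    link_free_at = 0.0
    for i, size in enumerate(frame_sizes):
        start = max(link_free_at, float(i))  # wait for earlier frames to finish
        link_free_at = start + size / budget
        delays.append(link_free_at - i)
    return delays


def pseudo_dual_delays(
    frame_sizes: List[float],
    budget: float,
    key_index: int,
    substitute_size: float,
) -> Tuple[List[float], Optional[int]]:
    """Toy dual-stream schedule (hypothetical, not the paper's algorithm).

    The playback stream carries a small substitute frame in place of the
    keyframe and is sent first. The keyframe itself (reference stream) is
    drained using whatever capacity each later interval has left over.
    Returns the playback delays and the number of intervals needed to finish
    the keyframe (None if it does not finish within the given frames).
    """
    playback = list(frame_sizes)
    remaining = playback[key_index]       # keyframe bytes still to send
    playback[key_index] = substitute_size
    delays = fifo_delays(playback, budget)

    for intervals, size in enumerate(playback[key_index:], start=1):
        remaining -= max(budget - size, 0.0)  # spare capacity this interval
        if remaining <= 0:
            return delays, intervals
    return delays, None


# Illustrative numbers only: one 60-unit keyframe among 10-unit non-keyframes.
sizes = [10, 10, 60, 10, 10, 10, 10, 10]
single = fifo_delays(sizes, budget=20)
dual, reference_ready = pseudo_dual_delays(sizes, budget=20, key_index=2,
                                           substitute_size=10)
print(max(single), max(dual), reference_ready)
```

In this toy setting the single stream's worst-case delay is set by the keyframe burst, while the dual schedule keeps playback delay flat and delivers the keyframe over several later intervals. That mirrors the "more immediate bandwidth" versus "more subsequent bandwidth" split described above, under the stated simplifications.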