🤖 AI Summary
To address the quality degradation over long sequences, the bandwidth and storage overhead, and the resulting difficulty of real-time streaming deployment in 4D Gaussian Splatting (4DGS), this paper proposes AirGS, a streaming-optimized 4DGS framework for low-latency Free-Viewpoint Video (FVV) delivery. The framework converts Gaussian video streams into multi-channel 2D formats, identifies keyframes to improve frame reconstruction quality, and combines temporal coherence with an inflation loss to reduce training time and representation size. For transmission, it formulates 4DGS delivery as an integer linear programming problem and uses a lightweight pruning level selection algorithm to adaptively prune the Gaussian updates that are sent. Experiments report frame-level PSNR consistently above 30 dB, a reduction of more than 20% in PSNR deviation when the scene changes, 6× faster training, and a nearly 50% smaller per-frame transmission size compared with state-of-the-art 4DGS approaches.
📝 Abstract
Free-viewpoint video (FVV) enables immersive viewing experiences by allowing users to view scenes from arbitrary perspectives. As a prominent reconstruction technique for FVV generation, 4D Gaussian Splatting (4DGS) models dynamic scenes with time-varying 3D Gaussian ellipsoids and achieves high-quality rendering via fast rasterization. However, existing 4DGS approaches suffer from quality degradation over long sequences and impose substantial bandwidth and storage overhead, limiting their applicability in real-time and wide-scale deployments. To address these limitations, we present AirGS, a streaming-optimized 4DGS framework that rearchitects the training and delivery pipeline to enable high-quality, low-latency FVV experiences. AirGS converts Gaussian video streams into multi-channel 2D formats and identifies keyframes to enhance frame reconstruction quality. It further combines temporal coherence with an inflation loss to reduce training time and representation size. To support communication-efficient transmission, AirGS models 4DGS delivery as an integer linear programming problem and designs a lightweight pruning level selection algorithm that adaptively prunes the Gaussian updates to be transmitted, balancing reconstruction quality against bandwidth consumption. Extensive experiments demonstrate that, compared to state-of-the-art (SOTA) 4DGS approaches, AirGS reduces PSNR quality deviation by more than 20% when the scene changes, maintains frame-level PSNR consistently above 30 dB, accelerates training by 6 times, and reduces per-frame transmission size by nearly 50%.
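The abstract does not spell out the integer linear program or the pruning level selection algorithm, so the following is only a minimal sketch of what such a selection problem might look like, under stated assumptions. It assumes each frame offers a few candidate pruning levels, each with a hypothetical integer transmission size and an estimated quality score, and that exactly one level must be chosen per frame under a shared bandwidth budget. This is a multiple-choice knapsack, which is one simple kind of integer linear program; it is solved here exactly with dynamic programming. The function name `select_pruning_levels`, the data layout, and all numbers are illustrative and are not taken from AirGS.

```python
from typing import List, Optional, Tuple

# Hypothetical layout: options[f][l] = (size, quality) for frame f at pruning level l.
Options = List[List[Tuple[int, float]]]


def select_pruning_levels(options: Options, budget: int) -> Tuple[Optional[float], List[int]]:
    """Pick one pruning level per frame to maximize total quality with total size <= budget.

    Returns (best_total_quality, chosen_levels), or (None, []) if no choice fits the budget.
    Sizes must be non-negative integers (for example, kilobytes per frame update).
    """
    neg_inf = float("-inf")
    # best[b] = highest total quality reachable using exactly b size units so far.
    best = [neg_inf] * (budget + 1)
    best[0] = 0.0
    picks: List[List[int]] = []  # picks[f][b] = level chosen for frame f to reach b

    for levels in options:
        nxt = [neg_inf] * (budget + 1)
        pick = [-1] * (budget + 1)
        for used in range(budget + 1):
            if best[used] == neg_inf:
                continue  # this budget usage is not reachable
            for level, (size, quality) in enumerate(levels):
                total = used + size
                if total <= budget and best[used] + quality > nxt[total]:
                    nxt[total] = best[used] + quality
                    pick[total] = level
        best = nxt
        picks.append(pick)

    end = max(range(budget + 1), key=lambda b: best[b])
    if best[end] == neg_inf:
        return None, []

    # Walk backwards through the stored picks to recover one level per frame.
    chosen: List[int] = []
    used = end
    for f in range(len(options) - 1, -1, -1):
        level = picks[f][used]
        chosen.append(level)
        used -= options[f][level][0]
    chosen.reverse()
    return best[end], chosen


# Illustrative numbers only: two frames, three pruning levels each (size, estimated PSNR).
example = [
    [(10, 34.0), (6, 32.0), (3, 30.0)],
    [(12, 35.0), (7, 33.0), (4, 30.5)],
]
total_quality, levels = select_pruning_levels(example, budget=14)
```

With a budget of 14 units, the sketch picks the middle pruning level for both frames, since the lightest pruning of either frame would leave too little budget for the other. A lightweight online selector such as the one the abstract describes would presumably trade this exact search for a faster heuristic; that design is not detailed in the text above.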