🤖 AI Summary
To address key bottlenecks in 3D Gaussian Splatting (3DGS) video streaming—including coarse tiling granularity, absence of dedicated quality assessment, and weak bitrate adaptation—this work proposes: (1) a visual-saliency-driven spatiotemporal adaptive tiling method enabling fine-grained, semantic-aware tile partitioning; (2) the first multi-dimensional quality assessment framework tailored for 3DGS, jointly modeling spatial fidelity and rendering distortion; and (3) a meta-learning-based lightweight bitrate decision model supporting rapid cross-scene generalization. Technically, the approach integrates saliency detection, deformation field modeling, spatial-domain degradation analysis, and 2D rendering quality evaluation. Experiments under dynamic network conditions demonstrate that our solution achieves an average PSNR gain of 2.1 dB and reduces stalling rate by 37% over state-of-the-art methods, significantly improving both efficiency and stability of immersive 3D video streaming.
📝 Abstract
3D Gaussian splatting video (3DGS) streaming has recently emerged as a research hotspot in both academia and industry, owing to its impressive ability to deliver immersive 3D video experiences. However, research in this area is still in its early stages, and several fundamental challenges, such as tiling, quality assessment, and bitrate adaptation, require further investigation. In this paper, we tackle these challenges by proposing a comprehensive set of solutions. Specifically, we propose an adaptive 3DGS tiling technique guided by saliency analysis, which integrates both spatial and temporal features. Each tile is encoded into versions possessing dedicated deformation fields and multiple quality levels for adaptive selection. We also introduce a novel quality assessment framework for 3DGS video that jointly evaluates spatial-domain degradation in 3DGS representations during streaming and the quality of the resulting 2D rendered images. Additionally, we develop a meta-learning-based adaptive bitrate algorithm specifically tailored for 3DGS video streaming, achieving optimal performance across varying network conditions. Extensive experiments demonstrate that our proposed approaches significantly outperform state-of-the-art methods.