CAT-3DGS: A Context-Adaptive Triplane Approach to Rate-Distortion-Optimized 3DGS Compression

📅 2025-03-01
🤖 AI Summary
To address the storage and transmission overheads of 3D Gaussian Splatting (3DGS) and the lack of rate-distortion (RD) optimized compression methods, this paper proposes CAT-3DGS, an end-to-end trained, RD-optimized coding framework for sparse Gaussian primitives. The method has three key components: (1) multi-scale triplanes, oriented along the principal axes of the Gaussian primitives, that capture inter-primitive (spatial) correlation and are coded with spatial autoregressive models in the projected 2D planes; (2) channel-wise autoregressive coding, with the triplanes serving as the hyperprior, that exploits the correlation within each primitive; and (3) a view frequency-aware masking mechanism that skips coding primitives with little impact on rendering quality. Built on ScaffoldGS, the approach achieves state-of-the-art compression performance on commonly used real-world datasets.

📝 Abstract
3D Gaussian Splatting (3DGS) has recently emerged as a promising 3D representation. Much research has focused on reducing its storage requirements and memory footprint. However, the need to compress and transmit the 3DGS representation to a remote side has been overlooked. This new application calls for rate-distortion-optimized 3DGS compression. How to quantize and entropy encode sparse Gaussian primitives in 3D space remains largely unexplored. A few early attempts resort to the hyperprior framework from learned image compression, but they fail to fully exploit the inter and intra correlation inherent in Gaussian primitives. Built on ScaffoldGS, this work, termed CAT-3DGS, introduces a context-adaptive triplane approach to their rate-distortion-optimized coding. It features multi-scale triplanes, oriented according to the principal axes of Gaussian primitives in 3D space, to capture their inter correlation (i.e., spatial correlation) for spatial autoregressive coding in the projected 2D planes. With these triplanes serving as the hyperprior, we further perform channel-wise autoregressive coding to leverage the intra correlation within each individual Gaussian primitive. CAT-3DGS also incorporates a view frequency-aware masking mechanism, which actively skips coding those Gaussian primitives that potentially have little impact on the rendering quality. Trained end-to-end to strike a good rate-distortion trade-off, CAT-3DGS achieves state-of-the-art compression performance on commonly used real-world datasets.
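
The abstract describes triplanes oriented along the principal axes of the Gaussian primitives, onto which 3D positions are projected. A minimal sketch of that geometric step follows, assuming plain PCA on primitive positions stands in for the orientation step. The function names `principal_axes` and `triplane_coords`, the uniform grid, and the `resolution` parameter are hypothetical and are not taken from the paper; the learned plane features and the autoregressive context models are not shown.

```python
import numpy as np

def principal_axes(points):
    """Return (centroid, axes); rows of `axes` are PCA directions, largest variance first."""
    centroid = points.mean(axis=0)
    centered = points - centroid
    cov = centered.T @ centered / len(points)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]           # reorder to descending variance
    return centroid, eigvecs[:, order].T

def triplane_coords(points, resolution):
    """Map each 3D point to integer cell indices on three PCA-aligned 2D planes."""
    centroid, axes = principal_axes(points)
    local = (points - centroid) @ axes.T        # coordinates in the PCA frame
    lo, hi = local.min(axis=0), local.max(axis=0)
    unit = (local - lo) / np.maximum(hi - lo, 1e-12)   # normalize to [0, 1]
    cells = np.minimum((unit * resolution).astype(int), resolution - 1)
    # Each plane drops one axis of the PCA frame (assumed naming: xy, xz, yz).
    return {"xy": cells[:, [0, 1]], "xz": cells[:, [0, 2]], "yz": cells[:, [1, 2]]}
```

Calling `triplane_coords(points, r)` for several values of `r` (for example 16, 32, 64) would give one set of plane indices per scale, which is one simple way to picture the multi-scale arrangement.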

Problem

Research questions and friction points this paper is trying to address.

Optimizes 3D Gaussian Splatting compression for storage and transmission.
Explores quantization and entropy encoding of sparse 3D Gaussian primitives.
Introduces context-adaptive triplane approach for rate-distortion-optimized coding.
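
To make the quantization and entropy-coding problem concrete, the sketch below estimates the bit cost of uniformly quantized values under a Gaussian probability model and combines it with a distortion term into a rate-distortion cost. This is the generic formulation used in learned compression, not the paper's exact model: the names `estimated_bits` and `rd_cost`, the unit quantization step, and the weight `lam` are assumptions for illustration.

```python
import math
import numpy as np

def gaussian_cdf(x, mean, scale):
    """CDF of a Gaussian with the given mean and standard deviation."""
    z = (x - mean) / (scale * math.sqrt(2.0))
    return 0.5 * (1.0 + np.vectorize(math.erf)(z))

def estimated_bits(values, mean, scale, step=1.0):
    """Quantize `values` to multiples of `step` and estimate the total code length in bits."""
    q = np.round(values / step) * step
    # Probability mass of each quantization bin under the Gaussian model.
    p = gaussian_cdf(q + step / 2, mean, scale) - gaussian_cdf(q - step / 2, mean, scale)
    bits = -np.log2(np.maximum(p, 1e-9))        # clamp to avoid log(0)
    return q, float(bits.sum())

def rd_cost(distortion, bits, lam):
    """Rate-distortion cost: distortion plus lam times rate."""
    return distortion + lam * bits
```

The better the predicted `mean` and `scale` match the actual values, the fewer bits the estimate yields. This is why context from neighboring primitives or from previously decoded channels can reduce the rate.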

Innovation

Methods, ideas, or system contributions that make the work stand out.

Context-adaptive triplane approach for 3DGS compression
Multi-scale triplanes capture spatial correlation
View frequency-aware masking optimizes coding efficiency
Yu-Ting Zhan
National Yang Ming Chiao Tung University, Taiwan
Cheng-Yuan Ho
National Yang Ming Chiao Tung University, Taiwan
Hebi Yang
National Yang Ming Chiao Tung University, Taiwan
Yi-Hsin Chen
National Yang Ming Chiao Tung University
Jui Chiu Chiang
National Chung Cheng University, Taiwan
Yu-Lun Liu
Assistant Professor, National Yang Ming Chiao Tung University
Computer Vision, Image Processing, Machine Learning, Deep Learning, Computational Photography
Wen-Hsiao Peng
Professor, Computer Science, National Chiao Tung University
Video coding standards, machine learning, computer vision, visual signal processing