GFix: Perceptually Enhanced Gaussian Splatting Video Compression

📅 2025-11-10
🤖 AI Summary
Existing 3D Gaussian Splatting (3DGS)-based video codecs suffer from noticeable visual artifacts and relatively low compression ratios. To address this, we propose GFix, a perception-driven, content-adaptive enhancement framework that uses a streamlined, single-step diffusion model as an off-the-shelf neural enhancer. The approach rests on the assumption that artifacts from 3DGS rendering and quantization resemble the noisy latents sampled during diffusion training. To keep the per-content updates small, GFix adds a modulated LoRA scheme: the low-rank decompositions are frozen and the intermediate hidden states are modulated, so the diffusion backbone is adapted efficiently with highly compressible updates. Experiments show strong perceptual quality enhancement: compared to GSVC, GFix achieves up to 72.1% BD-rate savings in LPIPS and 21.4% in FID.

📝 Abstract
3D Gaussian Splatting (3DGS) enhances 3D scene reconstruction through explicit representation and fast rendering, demonstrating potential benefits for various low-level vision tasks, including video compression. However, existing 3DGS-based video codecs generally exhibit more noticeable visual artifacts and relatively low compression ratios. In this paper, we specifically target the perceptual enhancement of 3DGS-based video compression, based on the assumption that artifacts from 3DGS rendering and quantization resemble noisy latents sampled during diffusion training. Building on this premise, we propose a content-adaptive framework, GFix, comprising a streamlined, single-step diffusion model that serves as an off-the-shelf neural enhancer. Moreover, to increase compression efficiency, we propose a modulated LoRA scheme that freezes the low-rank decompositions and modulates the intermediate hidden states, thereby achieving efficient adaptation of the diffusion backbone with highly compressible updates. Experimental results show that GFix delivers strong perceptual quality enhancement, outperforming GSVC with up to 72.1% BD-rate savings in LPIPS and 21.4% in FID.
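The headline numbers above are BD-rate savings measured against a perceptual metric. As a rough illustration of what such a figure means, the sketch below computes a standard Bjøntegaard delta rate: it fits a cubic polynomial to log-bitrate as a function of quality for each codec and averages the gap over the shared quality range. The function name `bd_rate` and all rate and LPIPS values are made up for illustration; the paper's exact evaluation protocol is not described on this page and may differ.

```python
import numpy as np

def bd_rate(anchor_rates, anchor_quality, test_rates, test_quality):
    """Bjontegaard delta rate in percent.

    Negative values mean the test codec needs less bitrate than the
    anchor for the same quality score.
    """
    q1 = np.asarray(anchor_quality, dtype=float)
    q2 = np.asarray(test_quality, dtype=float)
    log_r1 = np.log(np.asarray(anchor_rates, dtype=float))
    log_r2 = np.log(np.asarray(test_rates, dtype=float))

    # Cubic fit of log-rate as a function of the quality score.
    p1 = np.polyfit(q1, log_r1, 3)
    p2 = np.polyfit(q2, log_r2, 3)

    # Integrate both fits over the quality interval the curves share.
    lo = max(q1.min(), q2.min())
    hi = min(q1.max(), q2.max())
    P1, P2 = np.polyint(p1), np.polyint(p2)
    int1 = np.polyval(P1, hi) - np.polyval(P1, lo)
    int2 = np.polyval(P2, hi) - np.polyval(P2, lo)

    # Average log-rate difference, converted to a percentage.
    avg_diff = (int2 - int1) / (hi - lo)
    return (np.exp(avg_diff) - 1.0) * 100.0

# Illustrative (made-up) rate points in kbps with LPIPS scores.
anchor_kbps = [200, 400, 800, 1600]
test_kbps = [100, 200, 400, 800]      # half the anchor rate
lpips = [0.30, 0.22, 0.16, 0.12]      # same scores for both codecs
saving = bd_rate(anchor_kbps, lpips, test_kbps, lpips)
```

In this toy case the test codec reaches every LPIPS score at half the anchor bitrate, so `saving` comes out at roughly -50, that is, a 50% BD-rate saving. Because the integration runs along the quality axis, the same code works for metrics where lower is better, such as LPIPS and FID.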
Problem

Research questions and friction points this paper is trying to address.

Enhancing perceptual quality in 3DGS video compression
Reducing visual artifacts and improving compression ratios
Adapting diffusion models for efficient neural enhancement
Innovation

Methods, ideas, or system contributions that make the work stand out.

Streamlined single-step diffusion model for enhancement
Modulated LoRA scheme for efficient adaptation
Content-adaptive framework improving compression efficiency
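The modulated LoRA idea listed above (frozen low-rank decompositions, modulated intermediate hidden states) can be sketched for a single linear layer. This is one plausible reading under stated assumptions, not the authors' implementation: the class name `ModulatedLoRALinear`, the element-wise modulation vector `m`, and its placement on the rank-r hidden state are all hypothetical choices made for illustration.

```python
import numpy as np

class ModulatedLoRALinear:
    """Linear layer with a frozen base weight and frozen low-rank factors.

    Only the modulation vector `m` (one value per rank) is meant to be
    trained and transmitted per content. All names are hypothetical.
    """

    def __init__(self, d_in, d_out, rank, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)  # frozen backbone weight
        self.A = rng.standard_normal((rank, d_in)) / np.sqrt(d_in)   # frozen down-projection
        self.B = rng.standard_normal((d_out, rank)) / np.sqrt(rank)  # frozen up-projection
        self.m = np.zeros(rank)  # assumed trainable modulation; zeros leave the backbone unchanged

    def __call__(self, x):
        h = self.A @ x                             # rank-r intermediate hidden state
        return self.W @ x + self.B @ (self.m * h)  # modulate h, then project back up

    def num_update_params(self):
        # Parameters that would need to be sent in the bitstream per content.
        return self.m.size

layer = ModulatedLoRALinear(d_in=64, d_out=64, rank=4)
x = np.ones(64)
y_base = layer(x)        # m == 0, so this equals W @ x
layer.m = np.ones(4)     # m == 1 recovers a plain LoRA update B @ A @ x
y_lora = layer(x)
```

Under these assumptions the per-content update for this layer is 4 values, compared with 2 × 64 × 4 = 512 values if both low-rank factors were trained and sent, which is the sense in which such updates can be highly compressible.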
Siyue Teng
Visual Information Laboratory, University of Bristol, Bristol, BS1 5DD, United Kingdom
Ge Gao
Visual Information Laboratory, University of Bristol, Bristol, BS1 5DD, United Kingdom
Duolikun Danier
Postdoc, University of Edinburgh
video processing, video generation, computer vision, machine learning
Yuxuan Jiang
Visual Information Laboratory, University of Bristol, Bristol, BS1 5DD, United Kingdom
Fan Zhang
Visual Information Laboratory, University of Bristol, Bristol, BS1 5DD, United Kingdom
Thomas Davis
Visionular Inc., Los Altos, CA 94022 USA
Zoe Liu
Visionular Inc., Los Altos, CA 94022 USA
David R. Bull
Visual Information Laboratory, University of Bristol, Bristol, BS1 5DD, United Kingdom