GFix: Perceptually Enhanced Gaussian Splatting Video Compression

📅 2025-11-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing 3D Gaussian Splatting (3DGS)-based video codecs suffer from noticeable visual artifacts and relatively low compression ratios. To address this, we propose GFix, a perception-driven, content-adaptive enhancement framework that uses a streamlined single-step diffusion model as an off-the-shelf neural enhancer, on the assumption that artifacts from 3DGS rendering and quantization resemble the noisy latents sampled during diffusion training. GFix pairs the enhancer with a modulated LoRA scheme: the low-rank decompositions are frozen and the intermediate hidden states are modulated, which adapts the diffusion backbone efficiently and yields highly compressible updates. Experiments show that GFix delivers strong perceptual quality enhancement: compared with GSVC, it achieves up to 72.1% BD-rate savings in LPIPS and 21.4% in FID.

📝 Abstract

3D Gaussian Splatting (3DGS) enhances 3D scene reconstruction through explicit representation and fast rendering, demonstrating potential benefits for various low-level vision tasks, including video compression. However, existing 3DGS-based video codecs generally exhibit more noticeable visual artifacts and relatively low compression ratios. In this paper, we specifically target the perceptual enhancement of 3DGS-based video compression, based on the assumption that artifacts from 3DGS rendering and quantization resemble noisy latents sampled during diffusion training. Building on this premise, we propose a content-adaptive framework, GFix, comprising a streamlined, single-step diffusion model that serves as an off-the-shelf neural enhancer. Moreover, to increase compression efficiency, we propose a modulated LoRA scheme that freezes the low-rank decompositions and modulates the intermediate hidden states, thereby achieving efficient adaptation of the diffusion backbone with highly compressible updates. Experimental results show that GFix delivers strong perceptual quality enhancement, outperforming GSVC with up to 72.1% BD-rate savings in LPIPS and 21.4% in FID.
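
The headline numbers above are BD-rate savings measured on perceptual metrics. As a reminder of what that figure means, here is a minimal sketch of the common Bjøntegaard-style calculation: fit a cubic to log-bitrate as a function of quality for each codec, then average the gap over the shared quality range. The function name `bd_rate` and the sample rate/LPIPS points are illustrative assumptions; they are not the paper's evaluation script or its data.

```python
import numpy as np

def bd_rate(anchor_rate, anchor_quality, test_rate, test_quality):
    """Average bitrate change (%) of the test codec relative to the anchor
    at equal quality. Negative values mean the test codec saves bitrate."""
    q_a = np.asarray(anchor_quality, dtype=float)
    q_t = np.asarray(test_quality, dtype=float)
    log_a = np.log(np.asarray(anchor_rate, dtype=float))
    log_t = np.log(np.asarray(test_rate, dtype=float))

    # Cubic fit of log-rate as a function of the quality metric.
    poly_a = np.polyfit(q_a, log_a, 3)
    poly_t = np.polyfit(q_t, log_t, 3)

    # Integrate both fits over the overlapping quality interval.
    lo = max(q_a.min(), q_t.min())
    hi = min(q_a.max(), q_t.max())
    int_a = np.polyval(np.polyint(poly_a), hi) - np.polyval(np.polyint(poly_a), lo)
    int_t = np.polyval(np.polyint(poly_t), hi) - np.polyval(np.polyint(poly_t), lo)

    # Mean log-rate difference, converted back to a percentage.
    avg_diff = (int_t - int_a) / (hi - lo)
    return (np.exp(avg_diff) - 1.0) * 100.0

# Hypothetical rate/LPIPS points: the test codec needs half the bitrate
# of the anchor at every quality level.
anchor_kbps = [100.0, 200.0, 400.0, 800.0]
lpips = [0.40, 0.30, 0.20, 0.10]
test_kbps = [r / 2 for r in anchor_kbps]
saving = bd_rate(anchor_kbps, lpips, test_kbps, lpips)
print(f"{saving:.1f}")  # -50.0
```

Because the fit is over the metric axis, the same routine works for metrics where lower is better, such as LPIPS and FID: the integration interval is simply the overlap of the two measured quality ranges.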

Problem

Research questions and friction points this paper is trying to address.

Enhancing perceptual quality in 3DGS video compression

Reducing visual artifacts and improving compression ratios

Adapting diffusion models for efficient neural enhancement

Innovation

Methods, ideas, or system contributions that make the work stand out.

Streamlined single-step diffusion model for enhancement

Modulated LoRA scheme for efficient adaptation

Content-adaptive framework improving compression efficiency
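
The modulated LoRA idea listed above can be illustrated with a small sketch. The abstract states only that the low-rank decompositions are frozen and the intermediate hidden states are modulated; everything else here is an assumption made for illustration: the class name `ModulatedLoRALinear`, the choice of an elementwise scale vector `m` applied to the rank-sized hidden state, and the zero initialization. It is not the authors' implementation.

```python
import numpy as np

class ModulatedLoRALinear:
    """Linear layer with a frozen weight and frozen low-rank factors.
    Only the rank-sized modulation vector `m` is meant to be adapted
    and transmitted per content (an assumed form of the modulation)."""

    def __init__(self, d_in, d_out, rank, seed=0):
        rng = np.random.default_rng(seed)
        # Frozen backbone weight.
        self.W = rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)
        # Frozen low-rank decomposition (down- and up-projection).
        self.A = rng.standard_normal((rank, d_in)) / np.sqrt(d_in)
        self.B = rng.standard_normal((d_out, rank)) / np.sqrt(rank)
        # Modulation of the intermediate hidden state; zeros leave the
        # backbone output unchanged.
        self.m = np.zeros(rank)

    def forward(self, x):
        hidden = x @ self.A.T      # (batch, rank) intermediate hidden state
        hidden = hidden * self.m   # elementwise modulation
        return x @ self.W.T + hidden @ self.B.T

    def num_update_params(self):
        # Size of the per-content update that would need to be coded.
        return self.m.size

layer = ModulatedLoRALinear(d_in=16, d_out=8, rank=4)
x = np.ones((2, 16))
base_out = layer.forward(x)        # equals the frozen backbone output
layer.m[:] = 1.0
full_out = layer.forward(x)        # equals a plain LoRA update W + B @ A
```

Under these assumptions the per-layer update shrinks from `rank * (d_in + d_out)` values for a standard LoRA to `rank` values, which is one way to read the "highly compressible updates" claim.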

🔎 Similar Papers

MotionAura: Generating High-Quality and Motion Consistent Videos using Discrete Diffusion