🤖 AI Summary
To address the high computational overhead, poor real-time performance, and low energy efficiency of 3D Gaussian Splatting (3DGS) neural rendering on mobile SoCs, this paper proposes an algorithm–hardware co-optimization framework. Methodologically, it introduces three key innovations: (1) S², the first temporal redundancy compression algorithm that explicitly models and eliminates inter-frame redundancy in Gaussian parameters; (2) a Radiance Cache (RC) mechanism that, for the first time, decouples rasterization frequency from color integration; and (3) LuminCore, a dedicated rasterization accelerator. Evaluated on both real-world and synthetic scenes, the design achieves a 4.5× speedup and a 5.3× energy-efficiency improvement over a mobile Volta GPU, with negligible PSNR degradation (<0.2 dB). This work marks the first demonstration of real-time 3DGS neural rendering on mobile platforms.
📝 Abstract
3D Gaussian Splatting (3DGS) has substantially advanced neural rendering, but it remains computationally demanding on today's mobile SoCs. To address this challenge, we propose Lumina, a hardware-algorithm co-designed system that integrates two principal optimizations: a novel algorithm, S², and a radiance caching mechanism, RC, to improve the efficiency of neural rendering. The S² algorithm exploits temporal coherence across rendered frames to reduce computational overhead, while RC leverages the color integration process of 3DGS to decrease the frequency of intensive rasterization computations. Coupled with these techniques, we propose an accelerator architecture, LuminCore, to further accelerate cache lookup and address the fundamental inefficiencies of rasterization. We show that Lumina achieves a 4.5× speedup and a 5.3× energy reduction over a mobile Volta GPU, with marginal quality loss (<0.2 dB reduction in peak signal-to-noise ratio) across synthetic and real-world datasets.
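The radiance caching idea, reusing previously computed tile colors so the expensive rasterization step runs less often, can be illustrated with a toy sketch. This is not the paper's implementation: the pose-distance test, the tile-keyed cache, and all names (`RadianceCache`, `render_tile`, `pose_tolerance`) are illustrative assumptions, shown only to convey how cache lookups can gate rasterization frequency.

```python
import math

class RadianceCache:
    """Toy sketch of a radiance cache (illustrative, not the paper's design):
    reuse a previously rasterized tile color when the camera pose has
    barely moved, instead of re-running rasterization every frame."""

    def __init__(self, pose_tolerance=1e-2):
        self.pose_tolerance = pose_tolerance
        self.cache = {}  # tile_id -> (pose, color)

    def lookup(self, tile_id, pose):
        """Return a cached color if the stored pose is close enough, else None."""
        entry = self.cache.get(tile_id)
        if entry is None:
            return None
        cached_pose, color = entry
        if math.dist(cached_pose, pose) <= self.pose_tolerance:
            return color  # cache hit: skip rasterization for this tile
        return None

    def store(self, tile_id, pose, color):
        self.cache[tile_id] = (pose, color)


def render_tile(tile_id, pose, cache, rasterize):
    """Return a tile color, invoking the rasterizer only on a cache miss."""
    color = cache.lookup(tile_id, pose)
    if color is None:
        color = rasterize(tile_id, pose)
        cache.store(tile_id, pose, color)
    return color
```

Under this sketch, consecutive frames with near-identical camera poses hit the cache and reuse the stored color, which is the sense in which rasterization frequency is decoupled from per-frame color output.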