🤖 AI Summary
To address the low frame rates (2–9 FPS) and excessive DRAM bandwidth consumption of 3D Gaussian Splatting (3DGS) on mobile devices, this work introduces a memory-centric rendering paradigm and proposes the first voxel-driven streaming 3DGS algorithm–architecture co-design. Our approach comprises: (1) voxel-based streaming Gaussian rasterization, (2) memory-aware task scheduling, (3) hardware-accelerated DRAM traffic compression, and (4) a customized lightweight accelerator architecture. Evaluated against a mobile Ampere GPU, our design achieves up to 45.7× speedup and 62.9× energy savings. It consistently delivers 90 FPS real-time rendering on mainstream mobile SoCs, marking the first demonstration of high-frame-rate, memory-efficient 3DGS execution on resource-constrained edge devices.
📝 Abstract
3D Gaussian Splatting (3DGS) has gained popularity for its efficiency and sparse Gaussian-based representation. However, 3DGS struggles to meet the real-time requirement of 90 frames per second (FPS) on resource-constrained mobile devices, achieving only 2 to 9 FPS. Existing accelerators focus on compute efficiency but overlook memory efficiency, leading to redundant DRAM traffic. We introduce STREAMINGGS, a fully streaming 3DGS algorithm-architecture co-design that achieves fine-grained pipelining and reduces DRAM traffic by transforming tile-centric rendering into memory-centric rendering. Results show that our design achieves up to 45.7$\times$ speedup and 62.9$\times$ energy savings over mobile Ampere GPUs.
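
To put the frame-rate gap in concrete terms, the following minimal sketch converts the figures quoted in the abstract (a 90 FPS target against a 2 to 9 FPS baseline) into per-frame time budgets and the speedup needed to close the gap. The function and constant names are illustrative assumptions, not part of STREAMINGGS, and the calculation is plain arithmetic rather than a model of the accelerator.

```python
# Frame-budget arithmetic using only the numbers quoted in the abstract.
# All names here are hypothetical and chosen for illustration.

TARGET_FPS = 90.0            # real-time requirement stated in the abstract
BASELINE_FPS = (2.0, 9.0)    # reported 3DGS frame rates on mobile devices


def frame_time_ms(fps: float) -> float:
    """Time available (or spent) per frame, in milliseconds."""
    return 1000.0 / fps


def required_speedup(baseline_fps: float, target_fps: float = TARGET_FPS) -> float:
    """Factor by which rendering must speed up to reach the target rate."""
    return target_fps / baseline_fps


if __name__ == "__main__":
    print(f"budget at {TARGET_FPS:.0f} FPS: {frame_time_ms(TARGET_FPS):.2f} ms/frame")
    for fps in BASELINE_FPS:
        print(
            f"baseline {fps:.0f} FPS: {frame_time_ms(fps):.1f} ms/frame, "
            f"needs {required_speedup(fps):.1f}x to reach the target"
        )
```

Under these figures, a 90 FPS target leaves roughly 11 ms per frame, while the baseline spends between about 111 ms and 500 ms, so closing the gap requires a speedup of 10× to 45×. The reported "up to 45.7×" is a peak figure measured against a mobile Ampere GPU, so this comparison only shows the order of magnitude involved.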