🤖 AI Summary
This work addresses the challenge of learning view-dependent radiance and enabling efficient rendering for large-scale, unbounded 3D scenes, which are traditionally incompatible with differentiable rasterization pipelines. Methodologically, it jointly optimizes triangle meshes and spherical harmonic (SH) textures, introduces a differentiable SH texel interpolation scheme, and employs deferred rendering to back-propagate pixel gradients to CPU-resident SH textures. A CPU-GPU co-managed texture system loads only frustum-visible SH texels into GPU memory, enabling scalable training on unbounded scenes. The core contribution is the first seamless integration of SH textures into standard rasterization, yielding efficient, fully differentiable, and toolchain-compatible mesh texturing. Evaluated on Replica and FAST-LIVO2, the method achieves state-of-the-art performance on both extrapolation and interpolation tasks, outperforming 3D Gaussian Splatting and M2-Mapping while significantly reducing GPU memory consumption. Code is publicly available.
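The CPU-GPU co-managed texture system described above can be pictured as a simple residency cache: all SH textures live in host RAM, and only the set visible in the current frustum is resident on the GPU for a training step. The sketch below illustrates this idea only; the class name, method names, and dictionary-based "device" set are hypothetical stand-ins, not the paper's actual implementation.

```python
import numpy as np

class SHTexturePager:
    """Illustrative sketch of CPU-GPU co-managed SH textures (hypothetical API).

    All SH textures live in host (CPU) RAM; only those belonging to meshes
    inside the current camera frustum are moved to the 'device' set, standing
    in for GPU memory. Textures leaving the frustum are written back to host
    RAM so gradient updates made on the device are not lost.
    """

    def __init__(self):
        self.host = {}    # mesh_id -> SH texture array, resident in CPU RAM
        self.device = {}  # mesh_id -> SH texture array, 'GPU'-resident set

    def add_mesh(self, mesh_id, texture):
        """Register a mesh's SH texture; it starts out in host RAM."""
        self.host[mesh_id] = texture

    def update(self, visible_ids):
        """Sync residency with the set of frustum-visible mesh ids."""
        visible = set(visible_ids)
        # Evict textures whose meshes left the frustum (write back to host).
        for mid in list(self.device):
            if mid not in visible:
                self.host[mid] = self.device.pop(mid)
        # Upload textures for newly visible meshes.
        for mid in visible:
            if mid in self.host:
                self.device[mid] = self.host.pop(mid)
        return self.device
```

Because only the visible working set is device-resident at any time, peak GPU memory scales with what the camera sees rather than with total scene size, which is what allows training on unbounded scenes.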
📝 Abstract
In this paper, we present a 3D reconstruction and rendering framework, termed Mesh-Learner, that is natively compatible with traditional rasterization pipelines. It integrates meshes and spherical harmonic (SH) textures (i.e., textures filled with SH coefficients) into the learning process to learn each mesh's view-dependent radiance end-to-end. Images are rendered by interpolating the surrounding SH texels at each pixel's sampling point using a novel interpolation method; conversely, gradients from each pixel are back-propagated to the related SH texels in the SH textures. Mesh-Learner exploits graphics features of the rasterization pipeline (texture sampling, deferred rendering) for rendering, which makes it naturally compatible with tools (e.g., Blender) and tasks (e.g., 3D reconstruction, scene rendering, reinforcement learning for robotics) built on rasterization pipelines. Our system can train vast, unbounded scenes because only the SH textures within the camera frustum are transferred to the GPU for training; the remaining SH textures are stored in CPU RAM, keeping GPU memory usage moderate. On interpolation and extrapolation sequences of the Replica and FAST-LIVO2 datasets, our method achieves state-of-the-art rendering quality compared to existing methods (e.g., 3D Gaussian Splatting and M2-Mapping). To benefit the community, the code is available at https://github.com/hku-mars/Mesh-Learner.
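The core rendering step, interpolating the SH coefficients of the texels surrounding a pixel's sampling point and then evaluating the resulting SH at the view direction, can be sketched as follows. This is a minimal illustration under assumed conventions, not the paper's implementation: it uses plain bilinear weights, a degree-2 real SH basis, and a texture layout of shape `(H, W, 9, 3)` (9 SH bands, RGB), all of which are assumptions for the example.

```python
import numpy as np

def sh_basis_deg2(d):
    """Real spherical harmonic basis values (degrees 0-2) for unit direction d."""
    x, y, z = d
    return np.array([
        0.28209479177387814,                        # l=0 (DC term)
        -0.4886025119029199 * y,                    # l=1
        0.4886025119029199 * z,
        -0.4886025119029199 * x,
        1.0925484305920792 * x * y,                 # l=2
        -1.0925484305920792 * y * z,
        0.31539156525252005 * (3.0 * z * z - 1.0),
        -1.0925484305920792 * x * z,
        0.5462742152960396 * (x * x - y * y),
    ])

def render_pixel(texture, u, v, view_dir):
    """Bilinearly interpolate the SH coefficients of the 4 surrounding texels
    at UV (u, v), then evaluate the interpolated SH at the view direction.
    texture: (H, W, 9, 3) array of SH coefficients per texel (assumed layout)."""
    H, W = texture.shape[:2]
    x, y = u * (W - 1), v * (H - 1)
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1)
    fx, fy = x - x0, y - y0
    # Interpolating coefficients (rather than colors) keeps the map linear in
    # each texel, so pixel gradients flow back with the same bilinear weights.
    coeffs = ((1 - fx) * (1 - fy) * texture[y0, x0]
              + fx * (1 - fy) * texture[y0, x1]
              + (1 - fx) * fy * texture[y1, x0]
              + fx * fy * texture[y1, x1])                      # (9, 3)
    basis = sh_basis_deg2(view_dir / np.linalg.norm(view_dir))  # (9,)
    return np.clip(basis @ coeffs, 0.0, 1.0)                    # RGB color
```

Because every operation above is differentiable, the same bilinear weights and SH basis values that produce the pixel color also route the pixel's loss gradient back to the four contributing texels, which is what makes end-to-end learning of the SH textures possible.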