🤖 AI Summary
This work addresses 3D reconstruction and novel view synthesis for unbounded real-world scenes. We propose an implicit neural point cloud representation that jointly encodes geometric and radiometric properties within a continuous octree-based probability field and a multi-resolution hash grid. This representation combines the optimization-friendly nature of implicit functions with the geometric fidelity of explicit point clouds, enabling end-to-end training without SfM initialization and differentiable extraction of high-quality explicit point clouds. By pairing the implicit representation with a differentiable bilinear rasterizer, our method achieves high-fidelity rendering at interactive frame rates. On common benchmarks, it achieves state-of-the-art image quality (PSNR/SSIM/LPIPS) and improves performance on downstream tasks, including semantic segmentation and depth estimation, demonstrating both representational expressiveness and practical utility.
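To make the "octree-based probability field" idea concrete, the sketch below shows one plausible way to draw candidate point positions from such a field: each leaf cell stores a probability, a leaf is sampled proportionally to it, and a position is jittered uniformly inside the chosen cell. All names and the flat leaf layout are illustrative assumptions, not the paper's actual data structure.

```python
import numpy as np

def sample_points(leaf_centers, leaf_sizes, leaf_probs, n, rng):
    """Hedged sketch: draw n candidate points from a leaf-level
    probability field. leaf_centers: (L, 3), leaf_sizes: (L,),
    leaf_probs: (L,) unnormalized probabilities."""
    p = leaf_probs / leaf_probs.sum()          # normalize to a distribution
    idx = rng.choice(len(leaf_centers), size=n, p=p)  # pick leaves by mass
    # uniform jitter within each selected leaf cell
    jitter = rng.uniform(-0.5, 0.5, size=(n, 3)) * leaf_sizes[idx][:, None]
    return leaf_centers[idx] + jitter
```

In an actual octree the leaves would be traversed hierarchically and the probabilities updated during optimization; the flat arrays here only convey the sampling principle.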
📝 Abstract
We introduce a new approach for reconstruction and novel-view synthesis of unbounded real-world scenes. In contrast to previous methods using either volumetric fields, grid-based models, or discrete point cloud proxies, we propose a hybrid scene representation that implicitly encodes a point cloud in a continuous octree-based probability field and a multi-resolution hash grid. In doing so, we combine the benefits of both worlds and retain favorable behavior during optimization: our novel implicit point cloud representation and differentiable bilinear rasterizer enable fast rendering while preserving fine geometric detail, without depending on initial priors such as structure-from-motion point clouds. Our method achieves state-of-the-art image quality on several common benchmark datasets. Furthermore, we achieve fast inference at interactive frame rates, and can extract explicit point clouds to further enhance performance.
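The multi-resolution hash grid mentioned above can be sketched as follows. This is a minimal numpy illustration of the general hash-grid encoding idea (per-level feature tables indexed by a spatial hash, combined via trilinear interpolation); the table sizes, level counts, hash primes, and class names are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

# large primes commonly used for spatial hashing (illustrative choice)
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)

def hash_coords(coords, table_size):
    """Spatial hash of integer 3D grid coordinates into a feature table."""
    c = coords.astype(np.uint64)
    h = c[..., 0] * PRIMES[0] ^ c[..., 1] * PRIMES[1] ^ c[..., 2] * PRIMES[2]
    return h % np.uint64(table_size)

class MultiResHashGrid:
    """Hedged sketch of a multi-resolution hash grid encoder."""

    def __init__(self, n_levels=4, base_res=16, growth=2.0,
                 table_size=2**14, feat_dim=2, seed=0):
        rng = np.random.default_rng(seed)
        self.resolutions = [int(base_res * growth**l) for l in range(n_levels)]
        # one (learnable, here randomly initialized) feature table per level
        self.tables = [rng.normal(0.0, 1e-4, (table_size, feat_dim))
                       for _ in range(n_levels)]
        self.table_size = table_size

    def encode(self, x):
        """x: (N, 3) points in [0, 1)^3 -> (N, n_levels * feat_dim)."""
        feats = []
        for res, table in zip(self.resolutions, self.tables):
            p = x * res
            p0 = np.floor(p).astype(np.int64)
            w = p - p0  # trilinear interpolation weights
            acc = 0.0
            for dz in (0, 1):          # accumulate over the 8 cell corners
                for dy in (0, 1):
                    for dx in (0, 1):
                        corner = p0 + np.array([dx, dy, dz])
                        idx = hash_coords(corner, self.table_size)
                        wc = (np.where(dx, w[:, 0], 1 - w[:, 0]) *
                              np.where(dy, w[:, 1], 1 - w[:, 1]) *
                              np.where(dz, w[:, 2], 1 - w[:, 2]))
                        acc = acc + wc[:, None] * table[idx]
            feats.append(acc)
        return np.concatenate(feats, axis=1)
```

The concatenated per-level features would then be decoded by a small network into the geometric and appearance quantities the rasterizer consumes; that decoder is omitted here.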