🤖 AI Summary
Neural rendering methods (e.g., NeRF, 3DGS) for large-scale unbounded outdoor scenes heavily rely on Structure-from-Motion (SfM) to provide camera poses and sparse geometric priors, a major bottleneck in practical deployment. To address this, we propose an end-to-end 3D Gaussian Splatting (3DGS) reconstruction framework that eliminates SfM entirely. Our method introduces two key innovations: (1) an ICP-guided differentiable pose estimation module that jointly optimizes camera trajectories with scene geometry and appearance; and (2) a voxelization-guided adaptive Gaussian densification strategy that significantly improves robustness under weak texture, low image overlap, and large camera motion. Evaluated on multi-scale indoor and outdoor datasets, our approach reduces pose estimation error by 37% and improves novel-view synthesis PSNR by 2.1 dB over SfM-dependent baselines, demonstrating state-of-the-art performance in SfM-free neural reconstruction.
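The summary above does not spell out the pose-estimation machinery, but the classical point-to-point ICP it builds on is standard. Below is a minimal NumPy sketch of that baseline algorithm (SVD/Kabsch alignment plus brute-force nearest-neighbor matching); the function names, iteration counts, and tolerances are illustrative choices, not taken from the paper, and the actual method adds differentiable refinement on top of this.

```python
import numpy as np

def best_fit_transform(src, dst):
    """Rigid (R, t) minimizing ||src @ R.T + t - dst|| via the Kabsch/SVD method."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)           # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                    # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_d - R @ mu_s
    return R, t

def icp(src, dst, iters=50, tol=1e-8):
    """Point-to-point ICP: alternate nearest-neighbor matching and rigid alignment."""
    cur = src.copy()
    prev_err = np.inf
    R_total, t_total = np.eye(3), np.zeros(3)
    for _ in range(iters):
        # Brute-force correspondences for clarity (a KD-tree would scale better).
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
        idx = d2.argmin(axis=1)
        err = np.sqrt(d2[np.arange(len(cur)), idx]).mean()
        if abs(prev_err - err) < tol:
            break
        prev_err = err
        R, t = best_fit_transform(cur, dst[idx])
        cur = cur @ R.T + t
        # Compose the incremental update into the accumulated transform.
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total
```

In an SfM-free pipeline, a transform estimated this way between per-frame point sets can serve as the initialization that a differentiable pose refinement then optimizes jointly with the Gaussian scene parameters.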
📝 Abstract
In recent years, neural rendering methods such as NeRF and 3D Gaussian Splatting (3DGS) have made significant progress in scene reconstruction and novel view synthesis. However, they heavily rely on preprocessed camera poses and 3D structural priors from Structure-from-Motion (SfM), which are challenging to obtain in outdoor scenarios. To address this challenge, we propose to incorporate Iterative Closest Point (ICP) with optimization-based refinement to achieve accurate camera pose estimation under large camera movements. Additionally, we introduce a voxel-based scene densification approach to guide the reconstruction in large-scale scenes. Experiments demonstrate that our approach, ICP-3DGS, outperforms existing methods in both camera pose estimation and novel view synthesis across indoor and outdoor scenes of various scales. Source code is available at https://github.com/Chenhao-Z/ICP-3DGS.
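The abstract also does not detail the voxel-based densification, but the underlying idea of analyzing point occupancy on a voxel grid is simple to illustrate. The sketch below (our own illustrative code, with made-up parameter names like `voxel_size` and `min_points`) flags occupied but under-populated voxels, which could serve as candidate regions for adding new Gaussians in large-scale scenes.

```python
import numpy as np

def voxel_occupancy(points, voxel_size=0.5):
    """Map each 3D point to an integer voxel index; count points per voxel."""
    keys = np.floor(points / voxel_size).astype(np.int64)
    uniq, counts = np.unique(keys, axis=0, return_counts=True)
    return uniq, counts

def sparse_voxels(points, voxel_size=0.5, min_points=10):
    """Voxels that are occupied but under-populated: densification candidates."""
    uniq, counts = voxel_occupancy(points, voxel_size)
    return uniq[counts < min_points]
```

A densification loop could then spawn new Gaussians inside the voxels returned by `sparse_voxels`, concentrating capacity where the current reconstruction is thin rather than splitting Gaussians uniformly.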