🤖 AI Summary
This paper addresses 3D Gaussian Splatting (3DGS) reconstruction from coarse camera poses and noisy LiDAR point clouds. To eliminate reliance on off-the-shelf Structure-from-Motion (SfM) tools such as COLMAP, we propose an end-to-end joint optimization framework. Our key contributions are: (1) a hierarchical pose decoupling strategy that optimizes the camera-to-device and device-to-world transformations in sequence; (2) parameter-sensitivity-aware constraints combined with explicit geometric consistency regularization to improve robustness against sensor noise; and (3) differentiable multimodal co-training that integrates the image and LiDAR modalities. Evaluated on a newly collected dataset and two public benchmarks, our method achieves a 27% improvement in pose accuracy and an average PSNR gain of 5.3 dB in reconstruction quality, significantly outperforming existing multimodal 3DGS approaches and SfM-assisted methods.
📝 Abstract
3D Gaussian Splatting (3DGS) is a powerful reconstruction technique, but it must be initialized from accurate camera poses and high-fidelity point clouds. Typically, this initialization is taken from Structure-from-Motion (SfM) algorithms; however, SfM is time-consuming and restricts the application of 3DGS in real-world scenarios and large-scale scene reconstruction. We introduce a constrained optimization method for simultaneous camera pose estimation and 3D reconstruction that does not require SfM support. Core to our approach is decomposing a camera pose into a sequence of camera-to-(device-)center and (device-)center-to-world optimizations. To facilitate this decomposition, we propose two optimization constraints that are conditioned on the sensitivity of each parameter group and restrict each parameter's search space. In addition, because we learn the scene geometry directly from noisy point clouds, we propose geometric constraints to improve the reconstruction quality. Experiments demonstrate that the proposed method significantly outperforms the existing (multi-modal) 3DGS baseline and methods supplemented by COLMAP on both our collected dataset and two public benchmarks.
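The pose decomposition described in the abstract can be illustrated with a minimal sketch. It only shows how a camera-to-device transform and a device-to-world transform chain into a full camera pose; it is not the paper's optimization procedure. The function names, the use of 4x4 homogeneous matrices, and all numeric values are assumptions made for illustration.

```python
import numpy as np


def rigid_transform(rotation, translation):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    transform = np.eye(4)
    transform[:3, :3] = rotation
    transform[:3, 3] = translation
    return transform


def rot_z(angle_rad):
    """Rotation matrix for a rotation of angle_rad about the z-axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def compose_camera_to_world(device_to_world, camera_to_device):
    """Chain the two stages: camera frame -> device frame -> world frame."""
    return device_to_world @ camera_to_device


# Hypothetical values: a camera mounted 10 cm from the device center,
# and a device pose somewhere in the world frame.
camera_to_device = rigid_transform(rot_z(np.deg2rad(90.0)), [0.1, 0.0, 0.0])
device_to_world = rigid_transform(rot_z(np.deg2rad(30.0)), [5.0, 2.0, 0.0])

camera_to_world = compose_camera_to_world(device_to_world, camera_to_device)

# Renderers commonly consume the inverse (world-to-camera) transform.
world_to_camera = np.linalg.inv(camera_to_world)

# The camera center in world coordinates is the translation of camera_to_world.
camera_center_world = camera_to_world[:3, 3]
```

Under this factorization, the two transforms can be treated as separate parameter groups, which is one way to read the abstract's remark that each group gets its own sensitivity-conditioned constraint and restricted search space.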