🤖 AI Summary
Existing NeRF- and 3D Gaussian Splatting-based methods suffer significant degradation in novel view synthesis and geometric reconstruction under sparse-view settings, primarily due to unreliable camera pose estimation and insufficient supervision. This paper proposes a robust, pose-free 3D reconstruction framework. First, an initial dense point cloud and coarse camera poses are generated using the VGGT foundation model. Second, a hybrid Gaussian representation is introduced that jointly optimizes spatial positions and shape parameters. Third, a graph-guided cross-view attribute refinement module enforces geometric and appearance consistency across views, while optical-flow-driven depth regularization further enhances geometric fidelity. Evaluated on forward-facing and large-scale complex-scene benchmarks, the method substantially outperforms existing pose-free approaches, achieving superior novel view synthesis quality and geometrically consistent scene reconstruction.
📝 Abstract
Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS) have advanced 3D reconstruction and novel view synthesis, but remain heavily dependent on accurate camera poses and dense viewpoint coverage. These requirements limit their applicability in sparse-view settings, where pose estimation becomes unreliable and supervision is insufficient. To overcome these challenges, we introduce Gesplat, a 3DGS-based framework that enables robust novel view synthesis and geometrically consistent reconstruction from unposed sparse images. Unlike prior works that rely on COLMAP for sparse point cloud initialization, we leverage the VGGT foundation model to obtain more reliable initial poses and dense point clouds. Our approach integrates several key innovations: 1) a hybrid Gaussian representation with dual position-shape optimization enhanced by inter-view matching consistency; 2) a graph-guided attribute refinement module to enhance scene details; and 3) flow-based depth regularization that improves depth estimation accuracy for more effective supervision. Comprehensive quantitative and qualitative experiments demonstrate that our approach achieves more robust performance than other pose-free methods on both forward-facing and large-scale complex datasets.
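The flow-based depth regularization mentioned in the abstract can be understood through the standard flow–depth consistency idea: a depth map plus a relative camera pose induces an optical flow between two views, and its disagreement with an observed flow field penalizes inconsistent depth. The sketch below is an illustrative NumPy version of that generic consistency check; the function name, shapes, and the L1 penalty are assumptions for exposition, not the paper's actual loss.

```python
import numpy as np

def flow_depth_consistency(depth, flow, K, R, t):
    """Illustrative flow-depth consistency penalty (not Gesplat's exact loss).

    depth: (H, W) depth map for view 1
    flow:  (H, W, 2) observed optical flow from view 1 to view 2
    K:     (3, 3) camera intrinsics; R, t: relative pose from view 1 to view 2
    """
    H, W = depth.shape
    # Pixel grid in homogeneous coordinates.
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)  # (H, W, 3)
    # Back-project to 3D with the depth map, transform into view 2, re-project.
    rays = pix @ np.linalg.inv(K).T          # camera rays in view 1
    pts = rays * depth[..., None]            # 3D points in view 1
    pts2 = pts @ R.T + t                     # same points in view 2's frame
    proj = pts2 @ K.T
    uv2 = proj[..., :2] / proj[..., 2:3]     # projected pixel locations in view 2
    induced_flow = uv2 - pix[..., :2]        # flow implied by depth + pose
    return np.abs(induced_flow - flow).mean()  # L1 discrepancy vs. observed flow
```

With the identity pose and zero flow the penalty vanishes; a nonzero translation induces a parallax flow proportional to focal length over depth, so any mismatch with the observed flow directly signals depth error.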