SG2Loc: Sequential Visual Localization on 3D Scene Graphs

📅 2026-06-10

🤖 AI Summary

This work proposes a lightweight, sequence-based visual localization method that uses a compact 3D scene graph to address the high storage overhead and inefficient sequential pose optimization common in complex indoor environments. The scene graph is integrated into a particle filtering framework: camera poses are refined incrementally across an image sequence by matching patch-level semantic features against coarse object meshes projected from the graph. Experiments on real-world datasets indicate that the method significantly reduces storage requirements while maintaining localization performance.

📝 Abstract

Visual localization in complex indoor environments remains a critical challenge for robotics and AR applications. Sequential localization, where pose estimates are refined over time, is important for autonomous agents. However, traditional methods often require storing extensive image databases or point clouds, leading to significant storage overhead. This paper introduces a novel, lightweight approach to sequential visual localization using 3D scene graphs. Our method represents the environment with a compact scene graph, where nodes represent objects (with coarse meshes) and edges encode spatial relationships. For each image in the localization phase, we extract per-patch semantic features that predict object identities. Localization is performed within a particle filter framework. Each particle, representing a camera pose, projects the coarse object meshes from the scene graph into the image, assigning object identities to patches based on visibility. The weight of a particle is determined by the similarity between the per-patch features of the input image and the object features from the scene graph. Subsequent images are incorporated sequentially, refining the pose estimate. By leveraging a compact scene graph and efficient semantic matching, our method significantly reduces storage while maintaining performance on real-world datasets. The code will be available at https://github.com/DmblnNicole/sg2loc.
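The particle weighting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity measure, the weighting function, or the resampling scheme, so cosine similarity, a softmax with a `temperature` parameter, and systematic resampling are assumptions chosen for illustration. The mesh projection itself is not implemented; the hypothetical input `projected_ids` stands in for its result, giving for each particle the object id that its pose would project onto each patch (`-1` where no object is visible).

```python
import numpy as np


def cosine_similarity(a, b):
    """Row-wise cosine similarity between two (N, D) arrays."""
    norms = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1)
    return np.sum(a * b, axis=-1) / np.maximum(norms, 1e-12)


def particle_weights(patch_features, projected_ids, object_features,
                     temperature=0.1):
    """Weight particles by agreement between observed and expected features.

    patch_features:  (P, D) features extracted from the image patches.
    projected_ids:   (N, P) int array; object id that particle n projects
                     onto patch p, or -1 if no object covers the patch
                     (assumed output of the mesh projection step).
    object_features: (K, D) one feature vector per scene-graph object.
    temperature:     assumed softmax sharpness, not taken from the paper.
    """
    num_particles = projected_ids.shape[0]
    scores = np.zeros(num_particles)
    for n in range(num_particles):
        visible = projected_ids[n] >= 0
        if not np.any(visible):
            continue  # no object in view: score stays at 0
        expected = object_features[projected_ids[n, visible]]
        scores[n] = cosine_similarity(patch_features[visible], expected).mean()
    # Softmax over scores, shifted by the maximum for numerical stability.
    logits = scores / temperature
    weights = np.exp(logits - logits.max())
    return weights / weights.sum()


def systematic_resample(weights, rng):
    """Return particle indices drawn by systematic resampling."""
    n = len(weights)
    positions = (rng.random() + np.arange(n)) / n
    cumulative = np.cumsum(weights)
    cumulative[-1] = 1.0  # guard against floating-point round-off
    return np.searchsorted(cumulative, positions)
```

In a sequential setting, each new image would trigger a motion update of the particle poses (omitted here), a fresh projection, a call to `particle_weights`, and a resampling step, so that particles whose projected object layout matches the observed patch features survive over time.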

Problem

Research questions and friction points this paper is trying to address.

visual localization

indoor environments

sequential localization

3D scene graphs

storage overhead

Innovation

Methods, ideas, or system contributions that make the work stand out.

3D scene graph

sequential visual localization

semantic feature matching