Perspective from a Higher Dimension: Can 3D Geometric Priors Help Visual Floorplan Localization?

📅 2025-07-24
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address localization errors in 2D Floorplan Localization (FLoc) caused by dynamic visual changes and 3D object occlusions, this work reformulates FLoc as a 3D geometric alignment task, the first such formulation. We propose 3D geometric priors that jointly incorporate multi-view geometric constraints and scene surface reconstruction, explicitly encoding view invariance and view-to-plan spatial alignment to mitigate the modality gaps and geometric discrepancies between images and floorplans. The priors are trained end-to-end via self-supervised contrastive learning, requiring no manual annotations. Evaluated across multiple real-world building datasets, the method achieves significantly higher localization success rates than state-of-the-art methods with no increase in inference overhead. Our core contribution is the first 3D geometric prior for FLoc that unifies viewpoint robustness and scene alignment, enabling accurate, low-cost, and fully unsupervised localization.

πŸ“ Abstract
Since a building's floorplans are easily accessible, consistent over time, and inherently robust to changes in visual appearance, self-localization within the floorplan has attracted researchers' interest. However, since floorplans are minimalist representations of a building's structure, modal and geometric differences between visual perceptions and floorplans pose challenges to this task. While existing methods cleverly utilize 2D geometric features and pose filters to achieve promising performance, they fail to address the localization errors caused by frequent visual changes and view occlusions due to variously shaped 3D objects. To tackle these issues, this paper views the 2D Floorplan Localization (FLoc) problem from a higher dimension by injecting 3D geometric priors into the visual FLoc algorithm. For the 3D geometric prior modeling, we first model geometrically aware view invariance using multi-view constraints, i.e., leveraging principles of imaging geometry to provide matching constraints between multiple images that see the same points. Then, we further model the view-scene aligned geometric priors, enhancing the cross-modal geometry-color correspondences by associating the scene's surface reconstruction with the RGB frames of the sequence. Both 3D priors are modeled through self-supervised contrastive learning, so no additional geometric or semantic annotations are required. These 3D priors, distilled from extensive realistic scenes, bridge the modal gap and improve localization success without increasing the computational burden of the FLoc algorithm. Extensive comparative studies demonstrate that our method significantly outperforms state-of-the-art methods and substantially boosts FLoc accuracy. All data and code will be released after the anonymous review.
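The abstract states that both 3D priors are modeled through self-supervised contrastive learning over matched multi-view observations, but gives no loss formula. As a rough illustration only (the function name, temperature, and feature shapes below are assumptions, not the paper's actual implementation), a standard InfoNCE-style contrastive loss over features of the same 3D points seen from two views might look like:

```python
import numpy as np

def multiview_info_nce(feat_a, feat_b, temperature=0.07):
    """Illustrative InfoNCE loss: row i of feat_a and row i of feat_b are
    features of the same 3D point seen from two views (positives); all
    other pairings in the batch serve as negatives."""
    # L2-normalize each row so the dot product is cosine similarity
    a = feat_a / np.linalg.norm(feat_a, axis=1, keepdims=True)
    b = feat_b / np.linalg.norm(feat_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature                      # (N, N) similarities
    # softmax cross-entropy with the diagonal as the positive class
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Toy usage: 8 matched feature pairs of dimension 32; a slightly perturbed
# second view should score a much lower loss than an unrelated batch.
rng = np.random.default_rng(0)
a = rng.normal(size=(8, 32))
matched_loss = multiview_info_nce(a, a + 0.05 * rng.normal(size=a.shape))
unmatched_loss = multiview_info_nce(a, rng.normal(size=(8, 32)))
```

Minimizing such a loss pulls features of the same physical point together across viewpoints while pushing apart features of different points, which is one standard way to obtain the view-invariant representations the abstract describes.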
Problem

Research questions and friction points this paper is trying to address.

Bridges modal gap between visual perceptions and floorplans using 3D priors
Reduces localization errors from visual changes and occlusions
Enhances cross-modal geometry-color correspondences without extra annotations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Inject 3D geometric priors into FLoc
Model view invariance with multi-view constraints
Enhance geometry-color correspondences via reconstruction
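The paper associates the scene's surface reconstruction with the RGB frames of the sequence, which requires projecting reconstructed 3D points into the camera images. The paper's exact formulation is not given here; a minimal sketch of the standard pinhole projection (the intrinsics and pose values in the usage are hypothetical) shows how such geometry-color correspondences could be established:

```python
import numpy as np

def project_points(points_w, K, R, t):
    """Project 3D points in world coordinates into an image via the
    pinhole model x ~ K [R | t] X. Returns pixel coords and depths."""
    cam = (R @ points_w.T + t.reshape(3, 1)).T  # world -> camera frame
    depth = cam[:, 2]
    uv_h = (K @ cam.T).T                        # homogeneous pixel coords
    uv = uv_h[:, :2] / uv_h[:, 2:3]             # perspective divide
    return uv, depth

# Hypothetical intrinsics (fx = fy = 500, principal point at 320x240)
# and an identity camera pose; a surface point 2 m in front of the
# camera lands on the principal point.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
uv, depth = project_points(np.array([[0.0, 0.0, 2.0]]),
                           K, np.eye(3), np.zeros(3))
```

Each projected point can then be paired with the RGB value at its pixel location (subject to a depth/visibility check), giving the cross-modal geometry-color correspondences used as supervision.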
Bolei Chen
School of Computer Science and Engineering, Central South University
Jiaxu Kang
School of Computer Science and Engineering, Central South University
Haonan Yang
School of Computer Science and Engineering, Central South University
Ping Zhong
University of Houston
Jianxin Wang
School of Computer Science and Engineering, Central South University
Algorithm · Bioinformatics · Computer Network