Perspective from a Higher Dimension: Can 3D Geometric Priors Help Visual Floorplan Localization?

📅 2025-07-24
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address localization errors in 2D Floorplan Localization (FLoc) caused by dynamic visual changes and 3D object occlusions, this work reformulates FLoc as a 3D geometric alignment task, the first such formulation. We propose 3D geometric priors that jointly incorporate multi-view geometric constraints and scene surface reconstruction, explicitly encoding view invariance and view-to-plan spatial alignment to mitigate the modality gaps and geometric discrepancies between images and floorplans. The priors are trained end-to-end via self-supervised contrastive learning, requiring no manual annotations. Evaluated across multiple real-world building datasets, the method achieves significantly higher localization success rates than state-of-the-art methods with no increase in inference overhead. Our core contribution is the first 3D geometric prior for FLoc that unifies viewpoint robustness and scene alignment, enabling accurate, low-cost, and fully unsupervised localization.

πŸ“ Abstract
Since a building's floorplans are easily accessible, consistent over time, and inherently robust to changes in visual appearance, self-localization within the floorplan has attracted researchers' interest. However, since floorplans are minimalist representations of a building's structure, modal and geometric differences between visual perceptions and floorplans pose challenges to this task. While existing methods cleverly utilize 2D geometric features and pose filters to achieve promising performance, they fail to address the localization errors caused by frequent visual changes and view occlusions due to variously shaped 3D objects. To tackle these issues, this paper views the 2D Floorplan Localization (FLoc) problem from a higher dimension by injecting 3D geometric priors into the visual FLoc algorithm. For the 3D geometric prior modeling, we first model geometrically aware view invariance using multi-view constraints, i.e., leveraging principles of imaging geometry to provide matching constraints between multiple images that see the same points. Then, we further model the view-scene aligned geometric priors, enhancing the cross-modal geometry-color correspondences by associating the scene's surface reconstruction with the RGB frames of the sequence. Both 3D priors are modeled through self-supervised contrastive learning, so no additional geometric or semantic annotations are required. These 3D priors, distilled from extensive realistic scenes, bridge the modal gap and improve localization success without increasing the computational burden of the FLoc algorithm. Extensive comparative studies demonstrate that our method significantly outperforms state-of-the-art methods and substantially boosts FLoc accuracy. All data and code will be released after the anonymous review.
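The abstract states that both 3D priors are modeled through self-supervised contrastive learning over matched multi-view observations, but gives no loss formula. As a rough illustration only (the function name, temperature, and feature shapes below are assumptions, not the paper's actual implementation), a standard InfoNCE-style contrastive loss over features of the same 3D points seen from two views might look like:

```python
import numpy as np

def multiview_info_nce(feat_a, feat_b, temperature=0.07):
    """Illustrative InfoNCE loss: row i of feat_a and row i of feat_b are
    features of the same 3D point seen from two views (positives); all
    other pairings in the batch serve as negatives."""
    # L2-normalize each row so the dot product is cosine similarity
    a = feat_a / np.linalg.norm(feat_a, axis=1, keepdims=True)
    b = feat_b / np.linalg.norm(feat_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature                      # (N, N) similarities
    # softmax cross-entropy with the diagonal as the positive class
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Toy usage: 8 matched feature pairs of dimension 32; a slightly perturbed
# second view should score a much lower loss than an unrelated batch.
rng = np.random.default_rng(0)
a = rng.normal(size=(8, 32))
matched_loss = multiview_info_nce(a, a + 0.05 * rng.normal(size=a.shape))
unmatched_loss = multiview_info_nce(a, rng.normal(size=(8, 32)))
```

Minimizing such a loss pulls features of the same physical point together across viewpoints while pushing apart features of different points, which is one standard way to obtain the view-invariant representations the abstract describes.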
Problem

Research questions and friction points this paper is trying to address.

Bridges modal gap between visual perceptions and floorplans using 3D priors
Reduces localization errors from visual changes and occlusions
Enhances cross-modal geometry-color correspondences without extra annotations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Inject 3D geometric priors into FLoc
Model view invariance with multi-view constraints
Enhance geometry-color correspondences via reconstruction
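The paper associates the scene's surface reconstruction with the RGB frames of the sequence, which requires projecting reconstructed 3D points into the camera images. The paper's exact formulation is not given here; a minimal sketch of the standard pinhole projection (the intrinsics and pose values in the usage are hypothetical) shows how such geometry-color correspondences could be established:

```python
import numpy as np

def project_points(points_w, K, R, t):
    """Project 3D points in world coordinates into an image via the
    pinhole model x ~ K [R | t] X. Returns pixel coords and depths."""
    cam = (R @ points_w.T + t.reshape(3, 1)).T  # world -> camera frame
    depth = cam[:, 2]
    uv_h = (K @ cam.T).T                        # homogeneous pixel coords
    uv = uv_h[:, :2] / uv_h[:, 2:3]             # perspective divide
    return uv, depth

# Hypothetical intrinsics (fx = fy = 500, principal point at 320x240)
# and an identity camera pose; a surface point 2 m in front of the
# camera lands on the principal point.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
uv, depth = project_points(np.array([[0.0, 0.0, 2.0]]),
                           K, np.eye(3), np.zeros(3))
```

Each projected point can then be paired with the RGB value at its pixel location (subject to a depth/visibility check), giving the cross-modal geometry-color correspondences used as supervision.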
Bolei Chen
School of Computer Science and Engineering, Central South University
Jiaxu Kang
School of Computer Science and Engineering, Central South University
Haonan Yang
School of Computer Science and Engineering, Central South University
Ping Zhong
University of Houston
Jianxin Wang
School of Computer Science and Engineering, Central South University
Algorithm · Bioinformatics · Computer Network