🤖 AI Summary
In large-scale camera relocalization, existing 3D point/line map regression methods suffer from high computational cost and overfitting due to monolithic joint encoding. To address these issues, this paper proposes a decoupled point-line feature learning framework with priority-weighted fusion. We design two independent subnetworks—specifically for point and line feature encoding—to eliminate representational redundancy between modalities. Furthermore, we introduce a geometry-aware attention mechanism enabling end-to-end joint regression of 3D position and orientation. Evaluated on standard benchmarks—including 7Scenes and Cambridge Landmarks—our method achieves significant improvements over state-of-the-art approaches: point-line regression accuracy increases by up to 12.6%, while inference speed improves by 3.2×. The source code is publicly available.
📝 Abstract
In this paper, we present a new approach for improving 3D point and line mapping regression for camera re-localization. Previous methods typically rely on feature matching (FM) with stored descriptors or use a single network to encode both points and lines. While FM-based methods perform well in large-scale environments, they become computationally expensive with a growing number of mapping points and lines. Conversely, approaches that learn to encode mapping features within a single network reduce memory footprint but are prone to overfitting, as they may capture unnecessary correlations between points and lines. We propose that these features should be learned independently, each with a distinct focus, to achieve optimal accuracy. To this end, we introduce a new architecture that learns to prioritize each feature independently before combining them for localization. Experimental results demonstrate that our approach significantly enhances the 3D map point and line regression performance for camera re-localization. The implementation of our method will be publicly available at: https://github.com/ais-lab/pl2map/.