InterKey: Cross-modal Intersection Keypoints for Global Localization on OpenStreetMap

📅 2025-09-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address global localization for autonomous vehicles where GNSS is degraded or unavailable (e.g., urban canyons, tunnels), this paper proposes InterKey, a cross-modal method that matches onboard LiDAR point clouds against OpenStreetMap (OSM) vector data. The method treats road intersections as distinctive landmarks and builds a compact binary descriptor for each one by jointly encoding road and building imprints from both modalities. To bridge the gap between OSM's coarse abstraction and sensor data, it introduces three strategies: (i) discrepancy mitigation, (ii) orientation determination, and (iii) area-equalized sampling. The pipeline comprises intersection detection, building contour encoding, compact descriptor generation, and cross-modal feature matching. Evaluated on the KITTI dataset, the method reports state-of-the-art accuracy, outperforming recent baselines by a large margin. According to the authors, the framework generalizes to any sensor that can produce dense structural point clouds.

📝 Abstract
Reliable global localization is critical for autonomous vehicles, especially in environments where GNSS is degraded or unavailable, such as urban canyons and tunnels. Although high-definition (HD) maps provide accurate priors, the cost of data collection, map construction, and maintenance limits scalability. OpenStreetMap (OSM) offers a free and globally available alternative, but its coarse abstraction poses challenges for matching with sensor data. We propose InterKey, a cross-modal framework that leverages road intersections as distinctive landmarks for global localization. Our method constructs compact binary descriptors by jointly encoding road and building imprints from point clouds and OSM. To bridge modality gaps, we introduce discrepancy mitigation, orientation determination, and area-equalized sampling strategies, enabling robust cross-modal matching. Experiments on the KITTI dataset demonstrate that InterKey achieves state-of-the-art accuracy, outperforming recent baselines by a large margin. The framework generalizes to sensors that can produce dense structural point clouds, offering a scalable and cost-effective solution for robust vehicle localization.
Problem

Research questions and friction points this paper is trying to address.

Global localization for autonomous vehicles without GNSS
Matching coarse OpenStreetMap data with sensor observations
Bridging modality gaps between point clouds and map data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-modal intersection keypoints for localization
Binary descriptors encoding road and building imprints
Discrepancy mitigation and area-equalized sampling strategies
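
As a rough illustration of how compact binary descriptors are typically compared, the sketch below matches a query descriptor against a small set of map descriptors by Hamming distance. It is not the InterKey implementation: the descriptor length, the node names, the `max_distance` threshold, and the function names are all hypothetical, and the abstract does not say which distance or search strategy the authors use.

```python
from typing import Dict, Optional, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two descriptors stored as ints."""
    return bin(a ^ b).count("1")


def match_descriptor(
    query: int,
    map_descriptors: Dict[str, int],
    max_distance: int,
) -> Optional[Tuple[str, int]]:
    """Return (intersection_id, distance) for the closest map descriptor,
    or None if even the best candidate exceeds max_distance."""
    best: Optional[Tuple[str, int]] = None
    for node_id, desc in map_descriptors.items():
        d = hamming(query, desc)
        if best is None or d < best[1]:
            best = (node_id, d)
    if best is None or best[1] > max_distance:
        return None
    return best


# Hypothetical 16-bit descriptors for three map-side intersections
# (assumed values, not taken from the paper).
osm_db = {
    "node_a": 0b1111000011110000,
    "node_b": 0b1010101010101010,
    "node_c": 0b0000111100001111,
}

# A sensor-side descriptor that differs from node_b in two bits.
lidar_query = 0b1010101010100110

print(match_descriptor(lidar_query, osm_db, max_distance=4))
```

Storing each descriptor as an integer keeps the comparison to one XOR and a bit count, which is the usual appeal of binary descriptors over floating-point ones. A linear scan is enough for a sketch; a real map with many intersections would likely need an index.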
Authors

Nguyen Hoang Khoi Tran
Australian Centre for Robotics (ACFR) at The University of Sydney (NSW, Australia)
Julie Stephany Berrio
Australian Centre for Robotics (ACFR) at The University of Sydney (NSW, Australia)
Mao Shan
Australian Centre for Robotics, The University of Sydney, Australia
Robotics, V2X, Perception, C-ITS
Stewart Worrall
ACFR, University of Sydney
Vehicle automation, Vehicle localisation, Situation awareness, Intelligent transportation systems