🤖 AI Summary
Place recognition in GPS-denied environments is hampered by the noise and sparsity of radar data, while heterogeneous radar sensor configurations complicate unified cross-modality fusion. To address this, the paper proposes LRFusionPR, a LiDAR-radar fusion method built on a unified polar-coordinate bird's-eye view (BEV) representation. The method introduces three key components: (1) a dual-branch polar BEV architecture compatible with both single-chip solid-state and mechanically scanning radars; (2) a cross-attention mechanism for feature-level interaction between LiDAR and radar; and (3) a radar-only knowledge distillation branch that improves robustness under adverse weather (e.g., rain and fog). Evaluated on multiple benchmark datasets, the approach improves recognition accuracy and cross-weather robustness while supporting heterogeneous radar deployments. The source code will be released.
📝 Abstract
In autonomous driving, place recognition is critical for global localization in GPS-denied environments. LiDAR- and radar-based place recognition methods have garnered increasing attention, as LiDAR provides precise ranging, whereas radar remains resilient in adverse weather. However, effectively leveraging LiDAR-radar fusion for place recognition remains challenging. The noisy and sparse nature of radar data limits its potential to further improve recognition accuracy. In addition, heterogeneous radar configurations complicate the development of unified cross-modality fusion frameworks. In this paper, we propose LRFusionPR, which improves recognition accuracy and robustness by fusing LiDAR with either single-chip or scanning radar. Technically, we propose a dual-branch network that fuses the two modalities within a unified polar-coordinate bird's-eye view (BEV) representation. In the fusion branch, cross-attention is utilized to perform cross-modality feature interactions. Knowledge from the fusion branch is simultaneously transferred to a distillation branch, which takes radar as its only input, further improving robustness. Ultimately, the descriptors from both branches are concatenated, producing the multimodal global descriptor for place retrieval. Extensive evaluations on multiple datasets demonstrate that our LRFusionPR achieves accurate place recognition, while maintaining robustness under varying weather conditions. Our open-source code will be released at https://github.com/QiZS-BIT/LRFusionPR.
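To make the described pipeline concrete, below is a minimal PyTorch sketch of the dual-branch idea: a fusion branch in which LiDAR polar-BEV features attend to radar polar-BEV features via cross-attention, a radar-only distillation branch trained to mimic the fusion descriptor, and a final descriptor formed by concatenation. Everything here (module and function names such as `LRFusionPRSketch` and `polar_bev_encoder`, tensor shapes, the query/key assignment, and the MSE distillation loss) is an illustrative assumption, not the authors' released implementation.

```python
# Hypothetical sketch of the dual-branch fusion/distillation design.
# Shapes, encoders, and the distillation loss are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


def polar_bev_encoder(out_dim: int = 128) -> nn.Sequential:
    """Tiny CNN over a single-channel polar BEV grid (assumed layout)."""
    return nn.Sequential(
        nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(64, out_dim, 3, stride=2, padding=1), nn.ReLU(),
    )


class LRFusionPRSketch(nn.Module):
    def __init__(self, dim: int = 128, desc_dim: int = 256):
        super().__init__()
        self.lidar_enc = polar_bev_encoder(dim)
        self.radar_enc = polar_bev_encoder(dim)        # fusion-branch radar encoder
        self.radar_only_enc = polar_bev_encoder(dim)   # distillation branch
        # Cross-modality interaction; query/key direction is an assumption.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.fusion_head = nn.Linear(dim, desc_dim)
        self.distill_head = nn.Linear(dim, desc_dim)

    @staticmethod
    def to_tokens(fmap: torch.Tensor) -> torch.Tensor:
        # (B, C, H, W) -> (B, H*W, C) token sequence for attention.
        return fmap.flatten(2).transpose(1, 2)

    def forward(self, lidar_bev, radar_bev):
        # Fusion branch: LiDAR tokens attend to radar tokens.
        q = self.to_tokens(self.lidar_enc(lidar_bev))
        kv = self.to_tokens(self.radar_enc(radar_bev))
        fused, _ = self.cross_attn(q, kv, kv)
        fusion_desc = F.normalize(self.fusion_head(fused.mean(dim=1)), dim=-1)

        # Distillation branch: radar-only descriptor.
        r = self.to_tokens(self.radar_only_enc(radar_bev))
        distill_desc = F.normalize(self.distill_head(r.mean(dim=1)), dim=-1)

        # Multimodal global descriptor: concatenation of both branches.
        return torch.cat([fusion_desc, distill_desc], dim=-1), fusion_desc, distill_desc


model = LRFusionPRSketch()
lidar = torch.randn(2, 1, 128, 128)  # assumed (batch, 1, range bins, azimuth bins)
radar = torch.randn(2, 1, 128, 128)
desc, fusion_d, distill_d = model(lidar, radar)
# Knowledge transfer: the radar-only branch is pulled toward the fusion branch.
distill_loss = F.mse_loss(distill_d, fusion_d.detach())
print(desc.shape)  # torch.Size([2, 512])
```

In this sketch, `detach()` stops gradients from the distillation loss flowing into the fusion branch, so the radar-only branch learns from the fused representation without degrading it; whether the paper uses this exact stop-gradient scheme is an assumption.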