🤖 AI Summary
To address geometric misalignment, modality-specific sparsity discrepancies, and the limitations of unidirectional guidance in 4D radar–LiDAR fusion, this paper proposes a bidirectional, mutual-perception-enhanced fusion framework. Radar-informed features guide joint geometric learning across both modalities, while in the reverse direction LiDAR's high-fidelity shape priors compensate the sparse radar BEV features. Key innovations include multimodal feature alignment, BEV feature distillation, geometry–semantics co-modeling, and temporal 4D radar feature extraction. Evaluated on the View-of-Delft dataset, the method achieves a 3D detection mAP of 71.76% over the entire area and 86.36% within the driving corridor. For cars, AP improves by 4.17 and 4.20 percentage points over state-of-the-art unidirectional fusion approaches, demonstrating gains in both accuracy and robustness.
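To make the bidirectional idea concrete, below is a minimal PyTorch-style sketch of how mutual BEV enhancement could look: radar-informed features gate the denser LiDAR features, while LiDAR shape priors are added back to the sparse radar BEV map before fusion. The module names, tensor shapes, and fusion operators here are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn


class BidirectionalBEVFusion(nn.Module):
    """Toy sketch of mutual radar-LiDAR BEV enhancement.

    Both inputs are assumed to be BEV feature maps of shape (B, C, H, W)
    produced by separate radar / LiDAR backbones (an assumption for this
    example, not taken from the paper).
    """

    def __init__(self, channels: int = 64):
        super().__init__()
        # Radar -> LiDAR direction: a radar-informed gate that re-weights
        # the LiDAR features, mimicking radar "indicative" features guiding
        # joint geometric learning.
        self.radar_gate = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )
        # LiDAR -> radar direction: LiDAR shape priors projected and added
        # to the sparse radar BEV map to compensate its sparsity.
        self.shape_proj = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        # Final fusion of the two mutually enhanced maps.
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, radar_bev: torch.Tensor, lidar_bev: torch.Tensor) -> torch.Tensor:
        # Radar-guided enhancement of LiDAR geometry features.
        lidar_enh = lidar_bev * self.radar_gate(radar_bev)
        # LiDAR shape compensation of the sparse radar features.
        radar_enh = radar_bev + self.shape_proj(lidar_bev)
        # Concatenate and fuse the mutually enhanced representations.
        return self.fuse(torch.cat([lidar_enh, radar_enh], dim=1))


if __name__ == "__main__":
    fusion = BidirectionalBEVFusion(channels=64)
    radar = torch.randn(2, 64, 128, 128)   # sparse radar BEV features
    lidar = torch.randn(2, 64, 128, 128)   # dense LiDAR BEV features
    out = fusion(radar, lidar)
    print(out.shape)  # torch.Size([2, 64, 128, 128])
```

In this sketch the two directions are deliberately asymmetric, matching the summary's description: guidance from radar acts multiplicatively (attention-like), whereas LiDAR shape information acts additively to densify the radar branch.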
📝 Abstract
Radar and LiDAR have been widely used in autonomous driving: LiDAR provides rich structural information, while radar remains highly robust in adverse weather. Recent studies highlight the effectiveness of fusing radar and LiDAR point clouds. However, challenges remain due to modality misalignment and information loss during feature extraction. To address these issues, we propose a 4D radar–LiDAR framework in which the two modalities mutually enhance each other's representations. First, indicative features from radar are used to guide geometric feature learning for both radar and LiDAR. Then, to mitigate the sparsity gap between the modalities, shape information from LiDAR is used to enrich the radar BEV features. Extensive experiments on the View-of-Delft (VoD) dataset demonstrate the superiority of our approach over existing methods, achieving the highest mAP of 71.76% across the entire area and 86.36% within the driving corridor. For cars in particular, we improve AP by 4.17% and 4.20%, owing to their strong indicative features and symmetric shapes.