CVFusion: Cross-View Fusion of 4D Radar and Camera for 3D Object Detection

📅 2025-07-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address degraded 3D object detection performance under adverse weather, caused by the sparsity and high noise of 4D radar point clouds, this paper proposes a camera–4D radar cross-view two-stage fusion network. In the first stage, a radar-guided iterative bird's-eye view (BEV) fusion module improves proposal recall. In the second stage, instance-level aggregation of point-cloud, image, and BEV features enables fine-grained localization and classification. The approach integrates BEV-space alignment, iterative optimization, and complementary multimodal feature modeling to enhance robustness. Extensive experiments demonstrate state-of-the-art performance: the method achieves absolute mAP improvements of 9.10% on VoD and 3.68% on TJ4DRadSet over prior art, validating its effectiveness in challenging weather scenarios.

📝 Abstract
4D radar has received significant attention in autonomous driving thanks to its robustness under adverse weather. Due to the sparse points and noisy measurements of 4D radar, most research completes the 3D object detection task by integrating camera images and performing modality fusion in BEV space. However, the potential of the radar and of the fusion mechanism remains largely unexplored, hindering performance improvement. In this study, we propose a cross-view two-stage fusion network called CVFusion. In the first stage, we design a radar-guided iterative (RGIter) BEV fusion module to generate high-recall 3D proposal boxes. In the second stage, we aggregate features from multiple heterogeneous views, including points, image, and BEV, for each proposal. These comprehensive instance-level features greatly help refine the proposals and generate high-quality predictions. Extensive experiments on public datasets show that our method outperforms previous state-of-the-art methods by a large margin, with 9.10% and 3.68% mAP improvements on View-of-Delft (VoD) and TJ4DRadSet, respectively. Our code will be made publicly available.
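The two-stage flow described in the abstract can be sketched in miniature. This is a hypothetical illustration only: the function names, the sigmoid-gated blending used to stand in for RGIter, and the mean-pooling "head" are all assumptions, not the paper's actual architecture.

```python
import numpy as np

def stage_one_rgiter(radar_bev, image_bev, n_iters=2):
    """Sketch of radar-guided iterative (RGIter) BEV fusion: the radar BEV
    map repeatedly gates how much of the image BEV features is absorbed.
    (Illustrative stand-in, not the paper's module.)"""
    fused = radar_bev
    for _ in range(n_iters):
        guidance = 1.0 / (1.0 + np.exp(-fused))   # sigmoid gate from current fused map
        fused = fused + guidance * image_bev       # radar-guided refinement step
    return fused

def stage_two_refine(proposal_feats):
    """Sketch of instance-level refinement: concatenate per-proposal features
    from the point, image, and BEV views, then score each proposal.
    A channel mean stands in for the learned refinement head."""
    instance = np.concatenate(
        [proposal_feats["points"], proposal_feats["image"], proposal_feats["bev"]],
        axis=-1,
    )
    return instance.mean(axis=-1)

# Toy inputs: a 4x4 BEV grid and 8 proposals with 16-channel features per view.
radar_bev = np.random.rand(4, 4)
image_bev = np.random.rand(4, 4)
bev = stage_one_rgiter(radar_bev, image_bev)
scores = stage_two_refine({
    "points": np.random.rand(8, 16),
    "image": np.random.rand(8, 16),
    "bev": np.random.rand(8, 16),
})
print(bev.shape, scores.shape)   # (4, 4) (8,)
```

The sketch only mirrors the dataflow (BEV fusion first, per-proposal multi-view aggregation second); the real network uses learned convolutional and attention layers in place of these arithmetic placeholders.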
Problem

Research questions and friction points this paper is trying to address.

Enhancing 3D object detection using 4D radar and camera fusion
Improving sparse and noisy 4D radar data with cross-view fusion
Boosting autonomous driving performance in adverse weather conditions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-view two-stage fusion network (CVFusion)
Radar-guided iterative (RGIter) BEV fusion module for high-recall proposals
Instance-level aggregation of features from heterogeneous views (points, image, BEV)
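One ingredient of the multi-view aggregation above is sampling BEV features at each proposal's location. A minimal sketch, assuming a nearest-cell lookup (the paper does not specify this interpolation; the function name and metric-range convention are illustrative):

```python
import numpy as np

def gather_bev_features(bev_feats, centers, x_range, y_range):
    """Look up BEV features at proposal centers via nearest-cell indexing.
    bev_feats: (H, W, C) feature map; centers: (N, 2) metric (x, y) coords.
    Returns an (N, C) array of per-proposal BEV features."""
    H, W, _ = bev_feats.shape
    # Map metric coordinates into grid indices, clipped to the map bounds.
    xs = np.clip(((centers[:, 0] - x_range[0]) / (x_range[1] - x_range[0]) * W).astype(int), 0, W - 1)
    ys = np.clip(((centers[:, 1] - y_range[0]) / (y_range[1] - y_range[0]) * H).astype(int), 0, H - 1)
    return bev_feats[ys, xs]

# Toy map: H=2 rows, W=3 columns, C=4 channels, values 0..23.
bev = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
centers = np.array([[0.5, 0.5], [2.9, 1.9]])   # two proposal centers in metres
feats = gather_bev_features(bev, centers, x_range=(0.0, 3.0), y_range=(0.0, 2.0))
print(feats.shape)   # (2, 4)
```

A learned implementation would typically use bilinear interpolation and concatenate the result with point-view and image-view features before the refinement head.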