SpaRC-AD: A Baseline for Radar-Camera Fusion in End-to-End Autonomous Driving

📅 2025-08-14
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Existing end-to-end autonomous driving methods suffer from significant limitations in adverse weather conditions, occlusions, and accurate velocity estimation, hindering motion understanding and long-horizon trajectory prediction in safety-critical scenarios. To address these challenges, we propose the first planning-driven, query-based radar-camera fusion framework. Our method enhances cross-modal spatial consistency and temporal coherence through sparse 3D feature alignment, Doppler-assisted velocity estimation, agent-centric anchor optimization, map polyline modeling, and joint motion prediction. Evaluated on nuScenes, T-nuScenes, and Bench2Drive, our approach achieves a 4.8% improvement in 3D detection mAP, an 8.3% gain in multi-object tracking AMOTA, and a 9% reduction in trajectory planning error (TPC) over vision-only baselines.
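To make the Doppler term concrete, below is a minimal sketch of how a matched radar radial-velocity measurement could refine a network-predicted agent velocity: the radial component of the prediction is corrected toward the Doppler measurement while the tangential component is left untouched. The function name, tensor shapes, and blend factor are illustrative assumptions, not the authors' released API.

```python
# A minimal sketch of Doppler-assisted velocity refinement; names and
# shapes are illustrative assumptions, not the paper's implementation.
import torch

def doppler_refine(pred_vel, agent_xy, radar_radial_vel, alpha=0.5):
    """Blend a predicted 2D velocity with the radar Doppler constraint.

    pred_vel:          (N, 2) network-predicted (vx, vy) per agent
    agent_xy:          (N, 2) agent positions in the ego frame
    radar_radial_vel:  (N,) matched radar radial velocities (m/s)
    """
    # Unit line-of-sight vector from the ego sensor to each agent.
    los = agent_xy / agent_xy.norm(dim=-1, keepdim=True).clamp(min=1e-6)
    # Radial component of the predicted velocity.
    v_rad_pred = (pred_vel * los).sum(dim=-1)
    # Pull the radial component toward the Doppler measurement; the
    # tangential component, which radar cannot observe, is unchanged.
    correction = alpha * (radar_radial_vel - v_rad_pred)
    return pred_vel + correction.unsqueeze(-1) * los
```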

πŸ“ Abstract
End-to-end autonomous driving systems promise stronger performance through unified optimization of perception, motion forecasting, and planning. However, vision-based approaches face fundamental limitations in adverse weather conditions, partial occlusions, and precise velocity estimation: critical challenges in safety-sensitive scenarios where accurate motion understanding and long-horizon trajectory prediction are essential for collision avoidance. To address these limitations, we propose SpaRC-AD, a query-based end-to-end camera-radar fusion framework for planning-oriented autonomous driving. Through sparse 3D feature alignment and Doppler-based velocity estimation, we achieve strong 3D scene representations for the refinement of agent anchors, map polylines, and motion modeling. Our method achieves strong improvements over state-of-the-art vision-only baselines across multiple autonomous driving tasks, including 3D detection (+4.8% mAP), multi-object tracking (+8.3% AMOTA), online mapping (+1.8% mAP), motion prediction (-4.0% mADE), and trajectory planning (-0.1 m L2 and -9% TPC). We achieve both spatial coherence and temporal consistency on multiple challenging benchmarks, including the real-world open-loop nuScenes, the long-horizon T-nuScenes, and the closed-loop Bench2Drive simulator. We show the effectiveness of radar-based fusion in safety-critical scenarios where accurate motion understanding and long-horizon trajectory prediction are essential for collision avoidance. The source code of all experiments is available at https://phi-wol.github.io/sparcad/
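As a rough illustration of the sparse 3D feature alignment the abstract refers to, the sketch below projects 3D anchor keypoints into a single camera feature map and bilinearly samples one feature vector per point. The single-camera setup, intrinsics/extrinsics handling, and all names are assumptions for illustration, not the released code.

```python
# A minimal sketch of sparse 3D feature alignment for one camera;
# shapes and names are illustrative assumptions.
import torch
import torch.nn.functional as F

def sample_aligned_features(feat_map, points_3d, K, T_cam_from_ego):
    """Project 3D anchor keypoints into a camera feature map and sample.

    feat_map:        (1, C, H, W) image feature map
    points_3d:       (N, 3) keypoints in the ego frame
    K:               (3, 3) intrinsics, scaled to feature resolution
    T_cam_from_ego:  (4, 4) extrinsic transform ego -> camera
    """
    _, C, H, W = feat_map.shape
    # Transform keypoints into the camera frame (homogeneous coords).
    pts_h = torch.cat([points_3d, torch.ones(len(points_3d), 1)], dim=-1)
    pts_cam = (T_cam_from_ego @ pts_h.T).T[:, :3]
    # Perspective projection; clamp depth so points stay in front.
    depth = pts_cam[:, 2:3].clamp(min=1e-3)
    uv = (K @ (pts_cam / depth).T).T[:, :2]
    # Normalize pixel coordinates to [-1, 1] for grid_sample.
    grid = torch.stack([uv[:, 0] / (W - 1), uv[:, 1] / (H - 1)], dim=-1) * 2 - 1
    grid = grid.view(1, 1, -1, 2)
    # Bilinear sampling yields one feature vector per 3D keypoint: (N, C).
    return F.grid_sample(feat_map, grid, align_corners=True).view(C, -1).T
```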
Problem

Research questions and friction points the paper addresses.

Overcome vision limitations in adverse weather and occlusions
Improve velocity estimation for collision avoidance
Enhance 3D scene representation for autonomous driving
Innovation

Methods, ideas, or system contributions that make the work stand out.

Query-based camera-radar fusion framework (see the sketch after this list)
Sparse 3D feature alignment technique
Doppler-based velocity estimation method
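To give the fusion contribution some shape, here is a minimal sketch of a per-query camera-radar fusion step, assuming both modalities have already been sampled to a shared feature width. The gated-residual design and the module name are illustrative assumptions, not the paper's exact architecture.

```python
# A minimal sketch of per-query camera-radar feature fusion; the gating
# design is an assumption for illustration.
import torch
import torch.nn as nn

class QueryFusion(nn.Module):
    """Fuse camera and radar features for each sparse agent query."""

    def __init__(self, dim=256):
        super().__init__()
        # Gate decides, per channel, how much radar evidence to admit.
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())
        self.out = nn.Linear(dim, dim)

    def forward(self, cam_feat, radar_feat):
        # cam_feat, radar_feat: (B, num_queries, dim)
        g = self.gate(torch.cat([cam_feat, radar_feat], dim=-1))
        # Camera remains the backbone signal; radar is blended in where
        # the gate finds it informative (e.g., rain or occlusion).
        return self.out(cam_feat + g * radar_feat)
```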