DiffAD: A Unified Diffusion Modeling Approach for Autonomous Driving

📅 2025-03-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing end-to-end autonomous driving systems rely on separate task-specific heads to model perception, prediction, and planning. Although these architectures are trained in a fully differentiable manner, the tasks remain weakly coupled and poorly coordinated. This paper proposes DiffAD, a diffusion probabilistic model that reformulates driving as a conditional bird's-eye-view (BEV) image generation task. Heterogeneous driving targets are rasterized onto a single shared BEV image, their distribution is modeled in a latent space, and perception, prediction, and planning are optimized jointly, with the reverse (denoising) process iteratively refining the generated BEV image. Removing the separate task heads reduces system complexity and improves coordination between tasks. In closed-loop CARLA evaluation, the method reports a new state-of-the-art Success Rate and Driving Score.

📝 Abstract

End-to-end autonomous driving (E2E-AD) has rapidly emerged as a promising approach toward achieving full autonomy. However, existing E2E-AD systems typically adopt a traditional multi-task framework, addressing perception, prediction, and planning tasks through separate task-specific heads. Despite being trained in a fully differentiable manner, they still encounter issues with task coordination, and the system complexity remains high. In this work, we introduce DiffAD, a novel diffusion probabilistic model that redefines autonomous driving as a conditional image generation task. By rasterizing heterogeneous targets onto a unified bird's-eye view (BEV) and modeling their latent distribution, DiffAD unifies various driving objectives and jointly optimizes all driving tasks in a single framework, significantly reducing system complexity and harmonizing task coordination. The reverse process iteratively refines the generated BEV image, resulting in more robust and realistic driving behaviors. Closed-loop evaluations in CARLA demonstrate the superiority of the proposed method, achieving a new state-of-the-art Success Rate and Driving Score. The code will be made publicly available.
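
The abstract's idea of rasterizing heterogeneous targets onto one BEV image can be illustrated with a minimal sketch. The grid size, resolution, channel layout, and every helper name below are hypothetical choices made for illustration; this summary does not state how DiffAD encodes its targets, and the sketch ignores agent heading for brevity.

```python
import math

# Everything below (grid size, resolution, channel layout, helper names) is a
# hypothetical choice for illustration, not taken from the DiffAD paper.
GRID_SIZE = 64          # cells per side of the square BEV image
METERS_PER_CELL = 0.5   # spatial resolution
CHANNELS = {"agent": 0, "lane": 1, "plan": 2}  # one image plane per target type


def empty_bev():
    """Return a zero-filled BEV image indexed as bev[channel][row][col]."""
    return [[[0.0] * GRID_SIZE for _ in range(GRID_SIZE)] for _ in CHANNELS]


def world_to_cell(x, y):
    """Map ego-centred metres (x right, y forward) to (row, col); ego is at the centre."""
    col = math.floor(GRID_SIZE / 2 + x / METERS_PER_CELL)
    row = math.floor(GRID_SIZE / 2 - y / METERS_PER_CELL)
    return row, col


def mark(bev, channel, x, y):
    """Set one cell to 1.0, silently dropping points that fall outside the grid."""
    row, col = world_to_cell(x, y)
    if 0 <= row < GRID_SIZE and 0 <= col < GRID_SIZE:
        bev[CHANNELS[channel]][row][col] = 1.0


def draw_box(bev, channel, cx, cy, length, width):
    """Fill an axis-aligned box (heading is ignored in this sketch)."""
    top, left = world_to_cell(cx - width / 2, cy + length / 2)
    bottom, right = world_to_cell(cx + width / 2, cy - length / 2)
    plane = bev[CHANNELS[channel]]
    for row in range(max(top, 0), min(bottom, GRID_SIZE - 1) + 1):
        for col in range(max(left, 0), min(right, GRID_SIZE - 1) + 1):
            plane[row][col] = 1.0


def draw_polyline(bev, channel, points):
    """Rasterize a polyline by sampling each segment at half-cell spacing."""
    step = METERS_PER_CELL / 2
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        n = max(1, math.ceil(math.hypot(x1 - x0, y1 - y0) / step))
        for i in range(n + 1):
            t = i / n
            mark(bev, channel, x0 + t * (x1 - x0), y0 + t * (y1 - y0))


# A detected vehicle, a lane boundary, and a planned ego path share one image.
bev = empty_bev()
draw_box(bev, "agent", cx=3.0, cy=8.0, length=4.5, width=2.0)
draw_polyline(bev, "lane", [(-1.75, -16.0), (-1.75, 15.9)])
draw_polyline(bev, "plan", [(0.0, 0.0), (0.0, 5.0), (1.0, 10.0)])
```

Once perception outputs (agents, lanes) and planning outputs (the ego path) live in the same image, a single generative model can in principle produce all of them at once, which is the kind of unification the abstract describes.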

Problem

Research questions and friction points this paper is trying to address.

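Perception, prediction, and planning are handled by separate task-specific heads, which coordinate poorly even when trained end to end.

System complexity in multi-task E2E-AD frameworks remains high.

Can a single unified model reach state-of-the-art Success Rate and Driving Score in closed-loop CARLA evaluation?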

Innovation

Methods, ideas, or system contributions that make the work stand out.

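A single diffusion probabilistic model covers perception, prediction, and planning

Heterogeneous targets are rasterized onto a unified bird's-eye-view (BEV) image so that all tasks are optimized jointly

The reverse diffusion process iteratively refines the generated BEV image, yielding more robust and realistic driving behaviors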
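
The iterative refinement in the last point can be sketched with standard DDPM ancestral sampling. This is an assumption: the summary does not say which sampler, noise schedule, or step count DiffAD uses. The `oracle_denoiser` is a hypothetical stand-in for a learned network conditioned on sensor features; it cheats by knowing the clean latent, so that the loop runs without a trained model.

```python
import math
import random


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule and cumulative products of alpha = 1 - beta (assumed values)."""
    betas = [beta_start + (beta_end - beta_start) * t / (num_steps - 1)
             for t in range(num_steps)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return betas, alpha_bars


def oracle_denoiser(target):
    """Hypothetical placeholder for a learned, condition-aware noise predictor.

    It returns the exact noise that separates x_t from the known clean latent,
    which a trained network would only approximate from its conditioning input.
    """
    def predict_noise(x_t, alpha_bar):
        return [(x - math.sqrt(alpha_bar) * x0) / math.sqrt(1.0 - alpha_bar)
                for x, x0 in zip(x_t, target)]
    return predict_noise


def sample(predict_noise, dim, num_steps=50, seed=0):
    """Run the reverse process from pure Gaussian noise down to a clean latent."""
    rng = random.Random(seed)
    betas, alpha_bars = make_schedule(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for t in reversed(range(num_steps)):
        beta, alpha_bar = betas[t], alpha_bars[t]
        eps = predict_noise(x, alpha_bar)
        # Posterior mean of x_{t-1} given x_t and the predicted noise.
        coef = beta / math.sqrt(1.0 - alpha_bar)
        mean = [(xi - coef * ei) / math.sqrt(1.0 - beta) for xi, ei in zip(x, eps)]
        # Fresh noise is added at every step except the final one.
        sigma = math.sqrt(beta) if t > 0 else 0.0
        x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
    return x


# A four-element toy vector stands in for a flattened BEV latent.
target_latent = [1.0, -1.0, 0.5, 0.0]
result = sample(oracle_denoiser(target_latent), dim=len(target_latent))
```

With the oracle predictor, the final noise-free step lands on `target_latent` up to floating-point error. A learned predictor would instead move the sample toward BEV latents that are plausible given its conditioning input, one denoising step at a time.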
