DiffAD: A Unified Diffusion Modeling Approach for Autonomous Driving

Wang, Tao, Zhang, Cong, Qu, Xingguang, Li, Kun, Liu, Weiwei, Huang, Chang

Mar-15-2025–arXiv.org Artificial Intelligence

End-to-end autonomous driving (E2E-AD) has rapidly emerged as a promising approach toward achieving full autonomy. However, existing E2E-AD systems typically adopt a traditional multi-task framework, addressing perception, prediction, and planning tasks through separate task-specific heads. Despite being trained in a fully differentiable manner, they still encounter issues with task coordination, and the system complexity remains high. In this work, we introduce DiffAD, a novel diffusion probabilistic model that redefines autonomous driving as a conditional image generation task. By rasterizing heterogeneous targets onto a unified bird's-eye view (BEV) and modeling their latent distribution, DiffAD unifies various driving objectives and jointly optimizes all driving tasks in a single framework, significantly reducing system complexity and harmonizing task coordination. The reverse process iteratively refines the generated BEV image, resulting in more robust and realistic driving behaviors. Closed-loop evaluations in CARLA demonstrate the superiority of the proposed method, achieving a new state-of-the-art Success Rate and Driving Score. The code will be made publicly available.
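To make the idea of rasterizing heterogeneous targets onto one BEV image concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the 64x64 grid, the 0.5 m resolution, the three-channel layout (lanes, agents, ego plan), and the axis-aligned agent boxes are all assumptions chosen for illustration.

```python
# Illustrative BEV rasterizer. Every name, size, and channel assignment below
# is a hypothetical choice for this sketch, not taken from the DiffAD paper.

SIZE = 64                # BEV image is SIZE x SIZE pixels (assumed)
METERS_PER_PIXEL = 0.5   # spatial resolution (assumed)
LANE, AGENT, PLAN = 0, 1, 2  # one channel per target type (assumed layout)


def world_to_pixel(x, y):
    """Map ego-centred metres (x forward, y left) to (row, col).

    The ego vehicle sits at the image centre and forward points up.
    """
    row = SIZE // 2 - int(round(x / METERS_PER_PIXEL))
    col = SIZE // 2 - int(round(y / METERS_PER_PIXEL))
    return row, col


def draw_point(bev, channel, x, y):
    """Set one pixel to 1.0, silently clipping points outside the grid."""
    row, col = world_to_pixel(x, y)
    if 0 <= row < SIZE and 0 <= col < SIZE:
        bev[channel][row][col] = 1.0


def draw_polyline(bev, channel, points):
    """Rasterize a polyline by sampling each segment at sub-pixel spacing."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        extent = max(abs(x1 - x0), abs(y1 - y0))
        steps = max(1, int(extent / METERS_PER_PIXEL) * 2)
        for i in range(steps + 1):
            t = i / steps
            draw_point(bev, channel, x0 + t * (x1 - x0), y0 + t * (y1 - y0))


def draw_box(bev, channel, cx, cy, length, width):
    """Fill an axis-aligned box; heading is ignored to keep the sketch short."""
    step = METERS_PER_PIXEL / 2
    x = cx - length / 2
    while x <= cx + length / 2:
        y = cy - width / 2
        while y <= cy + width / 2:
            draw_point(bev, channel, x, y)
            y += step
        x += step


def rasterize_scene(lanes, agents, plan):
    """Draw lanes, agent boxes, and the ego plan into one 3-channel BEV image."""
    bev = [[[0.0] * SIZE for _ in range(SIZE)] for _ in range(3)]
    for lane in lanes:
        draw_polyline(bev, LANE, lane)
    for cx, cy, length, width in agents:
        draw_box(bev, AGENT, cx, cy, length, width)
    draw_polyline(bev, PLAN, plan)
    return bev


# Toy scene: one lane line 2 m to the left, one vehicle 10 m ahead,
# and a straight 5 m ego plan.
bev = rasterize_scene(
    lanes=[[(0.0, 2.0), (10.0, 2.0)]],
    agents=[(10.0, 0.0, 4.0, 2.0)],
    plan=[(0.0, 0.0), (5.0, 0.0)],
)
```

Because perception targets (lanes, agents) and the planning target (ego trajectory) end up as channels of the same image, a single generative model can, in principle, be trained on the stacked result instead of one head per task. The diffusion model itself is outside the scope of this sketch.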

