Diffusion-Based Planning for Autonomous Driving with Flexible Guidance

Yinan Zheng, Ruiming Liang, Kexin Zheng, Jinliang Zheng, Liyuan Mao, Jianxiong Li, Weihao Gu, Rui Ai, Shengbo Eben Li, Xianyuan Zhan, Jingjing Liu

arXiv.org (Artificial Intelligence)

Achieving human-like driving behaviors in complex open-world environments is a critical challenge in autonomous driving. Contemporary learning-based planning approaches, such as imitation learning methods, often struggle to balance competing objectives and lack safety assurance, due to limited adaptability and inadequacy in learning the complex multi-modal behaviors commonly exhibited in human planning, not to mention their strong reliance on fallback strategies with predefined rules. We propose a novel transformer-based Diffusion Planner for closed-loop planning, which can effectively model multi-modal driving behavior and ensure trajectory quality without any rule-based refinement. Our model supports joint modeling of both prediction and planning tasks under the same architecture, enabling cooperative behaviors between vehicles. Moreover, by learning the gradient of the trajectory score function and employing a flexible classifier guidance mechanism, Diffusion Planner effectively achieves safe and adaptable planning behaviors. Evaluations on the large-scale real-world autonomous planning benchmark nuPlan and our newly collected 200-hour delivery-vehicle driving dataset demonstrate that Diffusion Planner achieves state-of-the-art closed-loop performance with robust transferability across diverse driving styles.

Autonomous driving, as a cornerstone technology, is poised to usher transportation into a safer and more efficient era of mobility (Tampuu et al., 2020). The key challenge is achieving human-like driving behaviors in complex open-world environments while ensuring safety, efficiency, and comfort (Muhammad et al., 2020). Rule-based planning methods have demonstrated initial success in industrial applications (Fan et al., 2018) by defining driving behaviors and establishing boundaries derived from human knowledge. In contrast, learning-based planning methods acquire driving skills by cloning human driving behaviors from collected datasets (Caesar et al., 2021), a process made simpler through straightforward imitation learning losses. Additionally, the capabilities of these models can potentially be enhanced by scaling up training resources (Chen et al., 2023). Though promising, current learning-based planning methods still face several limitations.
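To make the flexible classifier guidance mentioned in the abstract concrete, below is a minimal sketch of one classifier-guided reverse-diffusion step, assuming a PyTorch setting. The names (`score_model`, `guidance_fn`, `sigma_t`, `guidance_scale`) are illustrative placeholders, not the paper's actual interface: the idea is simply that the learned trajectory score is shifted by the gradient of a differentiable guidance objective (e.g., a collision or comfort cost) before the sampling update.

```python
import torch

def guided_reverse_step(score_model, guidance_fn, x_t, t, sigma_t,
                        guidance_scale=1.0):
    """One reverse-diffusion sampling step with classifier guidance.

    score_model  -- learned score network s_theta(x, t) ~ grad_x log p_t(x)
    guidance_fn  -- differentiable objective J(x, t); its gradient steers
                    trajectories toward desired behaviors (hypothetical)
    sigma_t      -- noise scale at step t (schedule-dependent)
    """
    x_t = x_t.detach().requires_grad_(True)
    # Guidance gradient: approximates grad_x log p(condition | x_t).
    guidance_grad = torch.autograd.grad(guidance_fn(x_t, t).sum(), x_t)[0]
    with torch.no_grad():
        # Combine the learned trajectory score with the guidance term.
        score = score_model(x_t, t) + guidance_scale * guidance_grad
        # Simplified Euler-Maruyama-style update of the reverse SDE.
        x_prev = x_t + (sigma_t ** 2) * score + sigma_t * torch.randn_like(x_t)
    return x_prev
```

Because the guidance term enters only at sampling time, different objectives (safety margins, driving styles) can be swapped in without retraining the diffusion model, which is the adaptability the abstract refers to.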