HAD-Gen: Human-like and Diverse Driving Behavior Modeling for Controllable Scenario Generation

Cheng Wang, Lingxin Kong, Massimiliano Tamborski, Stefano V. Albrecht

arXiv.org Artificial Intelligence 

Abstract: Simulation-based testing has emerged as an essential tool for verifying and validating autonomous vehicles (AVs). However, contemporary methodologies, such as deterministic and imitation learning-based driver models, struggle to capture the variability of human-like driving behavior. Given these challenges, we propose HAD-Gen, a general framework for realistic traffic scenario generation that simulates diverse human-like driving behaviors. The framework first groups driving data into clusters, one per driving style. It then employs maximum entropy inverse reinforcement learning on each of the clusters to learn the reward function corresponding to each driving style. Using these reward functions, the method integrates offline reinforcement learning pre-training and multi-agent reinforcement learning algorithms to obtain general and robust driving policies. Multi-perspective simulation results show that our proposed scenario generation framework can simulate diverse, human-like driving behaviors with strong generalization capability. The proposed framework achieves a 90.96% goal-reaching rate, an off-road rate of 2.08%, and a collision rate of 6.91% in the generalization test, outperforming prior approaches by over 20% in goal-reaching performance.

Autonomous vehicles (AVs) represent a groundbreaking advancement in transportation technology, with the potential to enhance road safety, reduce traffic congestion, and improve overall mobility efficiency [1]. Over the past decade, significant progress has been made in transforming AVs from theoretical concepts into practical implementations, with numerous prototypes successfully navigating urban environments [2]. Despite these advancements, ensuring the safety and reliability of AVs across diverse real-world scenarios remains a significant challenge [3].

This work was funded by UK Research and Innovation (UKRI) under the UK government's Horizon Europe funding guarantee [grant number EP/Z533464/1].
Cheng Wang is with the School of Engineering and Physical Sciences, Heriot-Watt University, Edinburgh EH14 4AS, United Kingdom (e-mail: cheng.wang@hw.ac.uk). Lingxin Kong is with the School of Automation and Software Engineering, Shanxi University, Taiyuan 030031, China (e-mail: 202223601006@email.sxu.edu.cn).
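
To make the per-cluster reward learning described in the abstract more concrete, the following is a minimal sketch of maximum entropy inverse reinforcement learning with a linear reward over a small, fixed set of candidate trajectories. It is not the HAD-Gen implementation. The two features (normalized speed and normalized headway), the cluster labels `aggressive` and `cautious`, the candidate set, and the function name `maxent_irl` are all hypothetical choices made for illustration, and the demonstrations are assumed to be already assigned to clusters.

```python
import math


def maxent_irl(expert_features, candidate_features, lr=0.1, steps=500):
    """Fit linear reward weights w by gradient ascent on the MaxEnt likelihood.

    Each trajectory is summarized by a feature vector f. Under the MaxEnt
    model, P(trajectory) is proportional to exp(w . f) over the candidate
    set, and the log-likelihood gradient is
    (mean expert features) - (expected features under the model).
    """
    dim = len(expert_features[0])
    n = len(expert_features)
    expert_mean = [sum(f[d] for f in expert_features) / n for d in range(dim)]
    w = [0.0] * dim
    for _ in range(steps):
        scores = [sum(wi * fi for wi, fi in zip(w, f)) for f in candidate_features]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        z = sum(exps)
        probs = [e / z for e in exps]
        expected = [
            sum(p * f[d] for p, f in zip(probs, candidate_features))
            for d in range(dim)
        ]
        w = [wi + lr * (em - ex) for wi, em, ex in zip(w, expert_mean, expected)]
    return w


# Hypothetical features per trajectory: [normalized speed, normalized headway].
candidates = [[0.2, 0.8], [0.5, 0.5], [0.8, 0.2]]

# Hypothetical demonstrations, assumed to be pre-assigned to driving styles.
clusters = {
    "aggressive": [[0.8, 0.2], [0.7, 0.3]],
    "cautious": [[0.2, 0.8], [0.3, 0.7]],
}

# One reward function (weight vector) per driving-style cluster.
rewards = {style: maxent_irl(demos, candidates) for style, demos in clusters.items()}

for style, w in rewards.items():
    print(style, [round(x, 2) for x in w])
```

In this toy setting the learned weights differ in sign between the two clusters, which is the property that lets a downstream policy be trained toward a particular driving style. A realistic version would generate candidate trajectories from a planner or simulator rather than a fixed list.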