Benchmarking Lane-changing Decision-making for Deep Reinforcement Learning

Junjie Wang, Qichao Zhang, Dongbin Zhao

arXiv.org Artificial Intelligence 

It is expected that by 2050, autonomous driving technology can reduce vehicle emissions by 50%, and the road traffic casualty rate will approach zero [1]. For industry players, the main testing method is the real-vehicle road test. However, Kalra et al. [2] of the RAND Corporation conclude that, at the 95% confidence level, more than 14.2 billion km of road testing would be required to prove that the fatality rate of autonomous vehicles is 20% lower than that of human drivers. Therefore, virtual testing will be the primary means of validating and verifying autonomous vehicles. Reinforcement Learning (RL) agents learn by interacting with the environment, adjusting their policy according to the rewards they receive; by balancing exploration and exploitation, they seek the optimal policy that maximizes the expected cumulative reward [3]. Deep Reinforcement Learning (DRL), which combines the perception capability of Deep Learning (DL) with the decision-making capability of RL [4], is well suited to the autonomous driving decision-making problem, a typical sequential decision problem in a complex environment. Many existing studies apply DRL to intersection [5] and lane-changing [6], [7] scenarios, among others. Still, to the best of our knowledge, there is no standardized system of training and testing scenarios, evaluation metrics, and performance comparisons among baseline methods.
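As a brief formal sketch (the notation below is standard RL notation assumed here, not taken from the paper), the objective an RL agent maximizes can be written as the expected discounted return over trajectories induced by its policy:

$$
J(\pi) \;=\; \mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right], \qquad \gamma \in [0, 1),
$$

where $s_t$ and $a_t$ denote the state and action at step $t$, $r$ is the reward function, and $\gamma$ is the discount factor; the optimal policy $\pi^{*} = \arg\max_{\pi} J(\pi)$ is the policy corresponding to the maximum cumulative reward mentioned above.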