Purpose in the Machine: Do Traffic Simulators Produce Distributionally Equivalent Outcomes for Reinforcement Learning Applications?
Chen, Rex, Carley, Kathleen M., Fang, Fei, Sadeh, Norman
arXiv.org Artificial Intelligence
ABSTRACT

Traffic simulators are used to generate data for learning in intelligent transportation systems (ITSs). A key question is to what extent their modelling assumptions affect the capabilities of ITSs to adapt to various scenarios when deployed in the real world. This work focuses on two simulators commonly used to train reinforcement learning (RL) agents for traffic applications, CityFlow and SUMO. A controlled virtual experiment varying driver behavior and simulation scale finds evidence against distributional equivalence in RL-relevant measures from these simulators, with the root mean squared error and KL divergence being significantly greater than 0 for all assessed measures. While granular real-world validation generally remains infeasible, these findings suggest that traffic simulators are not a deus ex machina for RL training: understanding the impacts of inter-simulator differences is necessary to train and deploy RL-based ITSs.

1 INTRODUCTION

Transportation efficiency is becoming an increasingly critical challenge due to continual growth in the volume of people and objects that need to be transported. The 2021 Urban Mobility Report (Schrank et al. 2021) projected that, while the COVID-19 pandemic alleviated congestion, traffic levels in the US will quickly rebound in areas with expanding populations and job markets to produce the most rapid congestion growth since 1982. The increased traffic will stress existing infrastructure and result in social, economic, and environmental costs (Schrank et al. 2021), thus making the development and deployment of intelligent transportation systems (ITSs) a critical priority.

At the same time, advances in computational algorithms and roadway infrastructure made in response to these challenges provide opportunities to enhance ITS learning. For example, novel traffic signal control technologies based on reinforcement learning (RL), which learn adaptive signaling policies from simulations generated using real-world traffic data, have already achieved performance on par with and even exceeding traditional control methods (Chen et al. 2020). However, collecting data for ITS learning remains a nontrivial task.
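The abstract compares simulator outputs using root mean squared error (RMSE) and KL divergence. The following is only a minimal sketch of what such a comparison can look like, not the authors' procedure: the function names, the histogram-based KL estimate, the bin count, and the smoothing constant `eps` are all assumptions made for illustration, and the sample data is synthetic.

```python
import math
import random


def rmse(xs, ys):
    """RMSE between two equal-length sequences of paired values."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


def histogram(samples, lo, hi, bins):
    """Count samples into `bins` equal-width bins spanning [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for s in samples:
        # Clamp so that a sample equal to `hi` falls in the last bin.
        idx = min(int((s - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


def kl_divergence(p_samples, q_samples, bins=20, eps=1e-9):
    """Histogram estimate of KL(P || Q) from two sample sets.

    Both sample sets are binned on a shared range; `eps` (an assumed
    smoothing constant) keeps empty bins from producing log(0).
    """
    lo = min(min(p_samples), min(q_samples))
    hi = max(max(p_samples), max(q_samples))
    if hi == lo:
        hi = lo + 1.0  # avoid zero-width bins when all values coincide
    p_counts = histogram(p_samples, lo, hi, bins)
    q_counts = histogram(q_samples, lo, hi, bins)
    p_total = sum(p_counts) + eps * bins
    q_total = sum(q_counts) + eps * bins
    kl = 0.0
    for pc, qc in zip(p_counts, q_counts):
        p = (pc + eps) / p_total
        q = (qc + eps) / q_total
        kl += p * math.log(p / q)
    return kl


# Synthetic stand-ins for one measure (e.g. travel time) from two simulators.
rng = random.Random(0)
sim_a = [rng.gauss(30.0, 5.0) for _ in range(1000)]
sim_b = [rng.gauss(33.0, 5.0) for _ in range(1000)]

print("RMSE:", rmse(sim_a, sim_b))
print("KL(a || b):", kl_divergence(sim_a, sim_b))
```

Identical sample sets give a KL estimate of exactly 0, so values well above 0 on matched scenarios are the kind of signal the abstract describes. Note that the histogram estimate is sensitive to the bin count, and KL divergence is not symmetric in its arguments.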
Nov-13-2023
- Country:
  - Europe (1.00)
  - North America
    - Canada
    - United States
      - California (0.14)
      - Pennsylvania > Allegheny County
        - Pittsburgh (0.14)
- Genre:
  - Research Report > Experimental Study (1.00)
- Industry:
  - Transportation
    - Ground > Road (1.00)
    - Infrastructure & Services (1.00)
  - Transportation
- Technology: