He, Zhengbing
CoT-Drive: Efficient Motion Forecasting for Autonomous Driving with LLMs and Chain-of-Thought Prompting
Liao, Haicheng, Kong, Hanlin, Wang, Bonan, Wang, Chengyue, Ye, Wang, He, Zhengbing, Xu, Chengzhong, Li, Zhenning
Accurate motion forecasting is crucial for safe autonomous driving (AD). This study proposes CoT-Drive, a novel approach that enhances motion forecasting by leveraging large language models (LLMs) and a chain-of-thought (CoT) prompting method. We introduce a teacher-student knowledge distillation strategy to effectively transfer LLMs' advanced scene understanding capabilities to lightweight language models (LMs), ensuring that CoT-Drive operates in real time on edge devices while maintaining comprehensive scene understanding and generalization capabilities. By leveraging CoT prompting techniques for LLMs without additional training, CoT-Drive generates semantic annotations that significantly improve the understanding of complex traffic environments, thereby boosting the accuracy and robustness of predictions. Additionally, we present two new scene description datasets, Highway-Text and Urban-Text, designed for fine-tuning lightweight LMs to generate context-specific semantic annotations. Comprehensive evaluations on five real-world datasets demonstrate that CoT-Drive outperforms existing models, highlighting its effectiveness and efficiency in handling complex traffic scenarios. Overall, this study is the first to consider the practical application of LLMs in this field. It pioneers the training and use of a lightweight LLM surrogate for motion forecasting, setting a new benchmark and showcasing the potential of integrating LLMs into AD systems.
NeuralMOVES: A lightweight and microscopic vehicle emission estimation model based on reverse engineering and surrogate learning
Ramirez-Sanchez, Edgar, Tang, Catherine, Xu, Yaosheng, Renganathan, Nrithya, Jayawardana, Vindula, He, Zhengbing, Wu, Cathy
Transportation's significant contribution to emissions makes it a critical sector for climate change mitigation, as reducing emissions from transportation is essential for achieving global climate goals. The sector's transformation through electrification, automation, and intelligent infrastructure offers promising avenues for substantial emissions reductions (Sciarretta et al., 2020; International Energy Agency, 2023; McKinsey Center for Future Mobility, 2023). However, the success of these innovations depends critically on the availability of suitable and accurate emission estimation models to guide the design and deployment of new technologies. Motor Vehicle Emission Simulation (MOVES) (U.S. Environmental Protection Agency, 2022), one of the most well-established emission estimation models, serves as the official and state-of-the-art emission estimation model in the U.S., provided, enforced, and maintained by the U.S. Environmental Protection Agency (EPA). Despite its technical certification, MOVES' processing and software are tailored for two specific governmental uses, State Implementation Plans and Conformity Analyses (U.S. Environmental Protection Agency, 2021), which help states achieve and maintain air quality standards. Its use beyond trained practitioners and these specific analyses poses two main limitations. First, a steep learning curve, computational demands, and complex inputs make it difficult for researchers and practitioners to use. In particular, MOVES has rigid input requirements, including a combination of toggle-based settings within its GUI and structured input files in specific formats. Second, MOVES is tailored for macroscopic analysis and is unsuitable for microscopic applications, such as control and optimization, which commonly require second-by-second emission calculations for individual actions and vehicles.
AI-Driven Day-to-Day Route Choice
Wang, Leizhen, Duan, Peibo, He, Zhengbing, Lyu, Cheng, Chen, Xin, Zheng, Nan, Yao, Li, Ma, Zhenliang
Understanding individual travel behaviors is critical for developing efficient and sustainable transportation systems. Travel behavioral analysis aims to capture the decision-making process of individual travel execution, including travel route choice, travel mode choice, departure time choice, and trip purpose. Among these choices, modeling route choice not only helps analyze and understand travelers' behaviors, but also constitutes an essential part of traffic assignment methods [1]. Specifically, it enables the evaluation of travelers' perceptions of route characteristics, the forecasting of behavior in hypothetical scenarios, the prediction of future traffic dynamics on transportation networks, and the understanding of travelers' responses to travel information. Real-world route choice is complex because of the inherent difficulties in accurately representing human behavior, travelers' limited knowledge of network composition, uncertainties in perceptions of route characteristics, and the lack of precise information about travelers' preferences [1]. To overcome these limitations, day-to-day (DTD) traffic dynamics have attracted significant attention since they focus on drivers' dynamic shifts in route choices and the evolution of traffic flow over time, rather than merely static equilibrium states. DTD models are flexible enough to incorporate diverse behavioral rules such as forecasting [2, 3], bounded rationality [4, 5], decision-making based on prospects [6, 7], marginal utility effects [8, 9], and social interactions [10]. Despite these advantages identified in [11] and [12], DTD models still struggle to accurately reflect the observed fluctuations in traffic dynamics, particularly the persistent deviations around User Equilibrium (UE) noted in empirical studies [13, 14, 15]. To better understand traffic dynamics, Agent-Based Modeling (ABM) offers a promising alternative.
A Survey on Large Language Model-empowered Autonomous Driving
Zhu, Yuxuan, Wang, Shiyi, Zhong, Wenqing, Shen, Nianchen, Li, Yunqi, Wang, Siqi, Li, Zhiheng, Wu, Cathy, He, Zhengbing, Li, Li
Artificial intelligence (AI) plays a crucial role in autonomous driving (AD) research, propelling its development towards intelligence and efficiency. Currently, the development of AD technology follows two main technical paths: modularization and end-to-end. Modularization decomposes the driving task into modules such as perception, prediction, planning, and control, and trains them separately. Because the training objectives of the modules are inconsistent, the integrated system suffers from bias. The end-to-end path attempts to address this issue by using a single model that maps directly from sensor data to control signals. This path has limited ability to learn a comprehensive set of features and struggles to handle unpredictable long-tail events and complex urban traffic scenarios. In the face of the challenges encountered in both paths, many researchers believe that large language models (LLMs), with their powerful reasoning capabilities and extensive knowledge, may be the solution, expecting LLMs to provide AD systems with deeper levels of understanding and decision-making capabilities. To understand whether LLMs could enhance AD, this paper conducts a thorough analysis of the potential applications of LLMs in AD systems, including exploring their optimization strategies in both modular and end-to-end approaches, with a particular focus on how LLMs can tackle the problems and challenges present in current solutions. Furthermore, we discuss an important question: Can LLM-based artificial general intelligence (AGI) be a key to achieving high-level AD? We further analyze the potential limitations and challenges that LLMs may encounter in promoting the development of AD technology.
Large-Scale Traffic Data Imputation with Spatiotemporal Semantic Understanding
Zhang, Kunpeng, Wu, Lan, Zheng, Liang, Xie, Na, He, Zhengbing
Large-scale missing data is a challenging problem in Intelligent Transportation Systems (ITS). Many studies have been carried out to impute large-scale traffic data by considering their spatiotemporal correlations at a network level. In existing traffic data imputation methods, however, the rich semantic information of a road network has been largely ignored when capturing network-wide spatiotemporal correlations. This study proposes a Graph Transformer for Traffic Data Imputation (GT-TDI) model to impute large-scale traffic data with spatiotemporal semantic understanding of a road network. Specifically, the proposed model introduces semantic descriptions consisting of network-wide spatial and temporal information of traffic data to help the GT-TDI model capture spatiotemporal correlations at a network level. The proposed model takes incomplete data, the social connectivity of sensors, and semantic descriptions as input to perform imputation tasks with the help of a Graph Neural Network (GNN) and a Transformer. Extensive experiments are conducted on the PeMS freeway dataset to compare the proposed GT-TDI model with conventional methods, tensor factorization methods, and deep learning-based methods. The results show that the proposed GT-TDI outperforms existing methods under complex missing patterns and diverse missing rates. The code of the GT-TDI model will be available at https://github.com/KP-Zhang/GT-TDI.
Predicting Station-level Hourly Demands in a Large-scale Bike-sharing Network: A Graph Convolutional Neural Network Approach
Lin, Lei, He, Zhengbing, Peeta, Srinivas, Wen, Xuejin
Bike sharing is a vital component of a modern multi-modal transportation system. However, it suffers from the bike unbalancing problem due to fluctuating spatial and temporal demands. Accurate bike-sharing demand predictions can help operators plan optimal routes and schedules for bike redistribution, and therefore enhance system efficiency. In this study, we propose a novel Graph Convolutional Neural Network with Data-driven Graph Filter (GCNN-DDGF) model to predict station-level hourly demands in a large-scale bike-sharing network. With each station as a vertex in the network, the newly proposed GCNN-DDGF model is able to automatically learn the hidden correlations between stations, and thus overcomes a common issue reported in previous studies, i.e., that the quality and performance of GCNN models rely on the predefinition of the adjacency matrix. To show the performance of the proposed model, this study compares the GCNN-DDGF model with four GCNN models, whose adjacency matrices are taken from different bike-sharing system matrices: the Spatial Distance matrix (SD), the Demand matrix (DE), the Average Trip Duration matrix (ATD), and the Demand Correlation matrix (DC), respectively. The five types of GCNN models and the classic Support Vector Regression model are built on a Citi Bike dataset from New York City, which includes 272 stations and over 28 million transactions from 2013 to 2016. Results show that the GCNN-DDGF model has the lowest Root Mean Square Error, followed by the GCNN-DC model, and the GCNN-ATD model has the worst performance. Further examination shows that the learned DDGF captures some information similar to that embedded in the SD, DE, and DC matrices, and that it also uncovers more hidden heterogeneous pairwise correlations between stations that are not revealed by any of those matrices.
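The difference between a predefined adjacency matrix and a data-driven graph filter can be illustrated with a minimal sketch of one graph convolution layer. This is not the authors' implementation: the layer form ReLU(filter x features x weights), the function names, and all numeric values are assumptions made for illustration. The only point is that the same forward pass applies whether the filter is fixed in advance (as in the SD, DE, ATD, and DC variants) or treated as a trainable parameter (as in the DDGF variant).

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain nested-list matrix product (rows of a times columns of b)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def graph_conv(graph_filter: Matrix, features: Matrix, weights: Matrix) -> Matrix:
    """One illustrative layer: ReLU(graph_filter @ features @ weights).

    graph_filter: stations x stations, mixes information between stations
    features:     stations x input features (e.g. past hourly demand)
    weights:      input features x output features
    """
    mixed = matmul(graph_filter, features)  # aggregate across stations
    out = matmul(mixed, weights)            # transform the feature dimension
    return [[max(0.0, v) for v in row] for row in out]


# Hypothetical two-station example.
features = [[1.0], [2.0]]
weights = [[2.0]]

# Predefined filter: each station only sees the other station.
predefined = [[0.0, 1.0], [1.0, 0.0]]
fixed_output = graph_conv(predefined, features, weights)

# Data-driven filter: in training these entries would be updated by
# gradient descent together with the weights; the values here are made up.
learned = [[0.5, 0.5], [0.2, 0.8]]
learned_output = graph_conv(learned, features, weights)
```

With a predefined filter, the stations that exchange information are decided before training; with a data-driven filter, those pairwise strengths become part of what the model learns.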
Learning Traffic as Images: A Deep Convolutional Neural Network for Large-Scale Transportation Network Speed Prediction
Ma, Xiaolei, Dai, Zhuang, He, Zhengbing, Na, Jihui, Wang, Yong, Wang, Yunpeng
This paper proposes a convolutional neural network (CNN)-based method that learns traffic as images and predicts large-scale, network-wide traffic speed with high accuracy. Spatiotemporal traffic dynamics are converted to images describing the time and space relations of traffic flow via a two-dimensional time-space matrix. A CNN is applied to the image in two consecutive steps: abstract traffic feature extraction and network-wide traffic speed prediction. The effectiveness of the proposed method is evaluated on two real-world transportation networks, the second ring road and the north-east transportation network in Beijing, by comparing the method with four prevailing algorithms, namely, ordinary least squares, k-nearest neighbors, artificial neural network, and random forest, and three deep learning architectures, namely, stacked autoencoder, recurrent neural network, and long short-term memory network. The results show that the proposed method outperforms the other algorithms by an average accuracy improvement of 42.91% within an acceptable execution time. The CNN can train the model in a reasonable time and, thus, is suitable for large-scale transportation networks.
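The conversion of traffic observations into a two-dimensional time-space matrix can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's preprocessing code: the record layout (section index, time-interval index, speed), the choice of rows for road sections and columns for time intervals, the averaging of repeated observations, and the fill value for empty cells are all hypothetical.

```python
from typing import List, Tuple


def build_time_space_matrix(
    records: List[Tuple[int, int, float]],
    n_sections: int,
    n_intervals: int,
    fill: float = 0.0,
) -> List[List[float]]:
    """Average speeds into a (road section) x (time interval) grid.

    records: (section index, interval index, speed) observations.
    Cells without any observation receive `fill` (an assumed convention).
    """
    sums = [[0.0] * n_intervals for _ in range(n_sections)]
    counts = [[0] * n_intervals for _ in range(n_sections)]
    for section, interval, speed in records:
        sums[section][interval] += speed
        counts[section][interval] += 1
    return [
        [
            sums[s][t] / counts[s][t] if counts[s][t] else fill
            for t in range(n_intervals)
        ]
        for s in range(n_sections)
    ]


# Hypothetical observations: two readings fall in the same cell and are averaged.
records = [(0, 0, 60.0), (0, 0, 40.0), (1, 1, 30.0)]
matrix = build_time_space_matrix(records, n_sections=2, n_intervals=2)
```

Each cell then plays the role of a pixel, so the resulting grid can be passed to a CNN as a single-channel image.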