Wang, Yuning
CTS-CBS: A New Approach for Multi-Agent Collaborative Task Sequencing and Path Finding
Jiang, Junkai, Li, Ruochen, Yang, Yibin, Chen, Yihe, Wang, Yuning, Xu, Shaobing, Wang, Jianqiang
This paper addresses a generalization of Multi-Agent Pathfinding (MAPF), called Collaborative Task Sequencing - Multi-Agent Pathfinding (CTS-MAPF), where agents must plan collision-free paths and visit a series of intermediate task locations in a specific order before reaching their final destinations. To address this problem, we propose a new approach, Collaborative Task Sequencing - Conflict-Based Search (CTS-CBS), which conducts a two-level search. In the high level, it generates a search forest, where each tree corresponds to a joint task sequence derived from the jTSP solution. In the low level, CTS-CBS performs constrained single-agent path planning to generate paths for each agent while adhering to high-level constraints. We also provide theoretical guarantees of its completeness and optimality (or sub-optimality with a bounded parameter). To evaluate the performance of CTS-CBS, we create two datasets, CTS-MAPF and MG-MAPF, and conduct comprehensive experiments. The results show that CTS-CBS adaptations for MG-MAPF outperform baseline algorithms in terms of success rate (up to 20 times larger) and runtime (up to 100 times faster), with less than a 10% sacrifice in solution quality. Furthermore, CTS-CBS offers flexibility by allowing users to adjust the sub-optimality bound omega to balance between solution quality and efficiency. Finally, practical robot tests demonstrate the algorithm's applicability in real-world scenarios.
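As an illustration of the conflict check that Conflict-Based Search methods run at the high level, the following is a minimal sketch, not the authors' implementation. It assumes grid cells as `(row, col)` tuples, one cell per time step, and agents that wait at their final cell; the names `position_at` and `first_conflict` are hypothetical.

```python
from typing import List, Optional, Tuple

Cell = Tuple[int, int]  # assumed grid representation: (row, col)


def position_at(path: List[Cell], t: int) -> Cell:
    """Cell occupied at time t; the agent is assumed to wait at its last cell."""
    return path[min(t, len(path) - 1)]


def first_conflict(path_a: List[Cell], path_b: List[Cell]) -> Optional[tuple]:
    """Return the earliest vertex or edge (swap) conflict, or None if none exists."""
    horizon = max(len(path_a), len(path_b))
    for t in range(horizon):
        a_now, b_now = position_at(path_a, t), position_at(path_b, t)
        # Vertex conflict: both agents occupy the same cell at time t.
        if a_now == b_now:
            return ("vertex", t, a_now)
        # Edge conflict: the agents swap cells between t and t + 1.
        a_next, b_next = position_at(path_a, t + 1), position_at(path_b, t + 1)
        if a_now == b_next and a_next == b_now:
            return ("edge", t, (a_now, a_next))
    return None


if __name__ == "__main__":
    crossing = first_conflict([(0, 0), (0, 1), (0, 2)], [(0, 2), (0, 1), (0, 0)])
    print(crossing)
```

In a CBS-style search, a detected conflict would typically be split into two child nodes, each forbidding one of the two agents from using the conflicting cell or edge at that time step, after which the low-level planner replans only the constrained agent.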
Controllable Traffic Simulation through LLM-Guided Hierarchical Chain-of-Thought Reasoning
Liu, Zhiyuan, Li, Leheng, Wang, Yuning, Lin, Haotian, Liu, Zhizhe, He, Lei, Wang, Jianqiang
Evaluating autonomous driving systems in complex and diverse traffic scenarios through controllable simulation is essential to ensure their safety and reliability. However, existing traffic simulation methods face challenges in their controllability. To address this, this paper proposes a novel diffusion-based and LLM-enhanced traffic simulation framework. Our approach incorporates a unique chain-of-thought (CoT) mechanism, which systematically examines the hierarchical structure of traffic elements and guides LLMs to thoroughly analyze traffic scenario descriptions step by step, enhancing their understanding of complex situations. Furthermore, we propose a Frenet-frame-based cost function framework that provides LLMs with geometrically meaningful quantities, improving their grasp of spatial relationships in a scenario and enabling more accurate cost function generation. Experiments on the Waymo Open Motion Dataset (WOMD) demonstrate that our method handles more intricate descriptions, generates a broader range of scenarios in a controllable manner, and outperforms existing diffusion-based methods in terms of efficiency.
S2O: An Integrated Driving Decision-making Performance Evaluation Method Bridging Subjective Feeling to Objective Evaluation
Wang, Yuning, Ke, Zehong, Jiang, Yanbo, Li, Jinhao, Xu, Shaobing, Dolan, John M., Wang, Jianqiang
Autonomous driving decision-making is one of the critical modules towards intelligent transportation systems, and how to evaluate the driving performance comprehensively and precisely is a crucial challenge. A biased evaluation misleads and hinders decision-making modification and development. Current planning evaluation metrics include deviation from the real driver trajectory and objective driving experience indicators. The former does not necessarily indicate good driving performance, since human drivers also make errors, and it has been proven ineffective in interactive closed-loop systems. On the other hand, existing objective driving experience models only consider limited factors, lacking comprehensiveness. Moreover, the integration mechanism of various factors relies on intuitive experience, lacking precision. In this research, we propose S2O, a novel integrated decision-making evaluation method bridging subjective human feeling to objective evaluation. First, modified fundamental models of four kinds of driving factors (safety, time efficiency, comfort, and energy efficiency) are established to cover common driving factors. Then, based on the analysis of human rating distribution regularity, a segmental linear fitting model in conjunction with a complementary SVM segment classifier is designed to express human subjective ratings by objective driving factor terms. Experiments are conducted on the D2E dataset, which includes approximately 1,000 driving cases and 40,000 human rating scores. Results show that S2O achieves a mean absolute error of 4.58 to ground truth under a percentage scale. Compared with baselines, the evaluation error is reduced by 32.55%. Implementation on the SUMO platform proves the real-time efficiency of online evaluation, and validation on performance evaluation of three autonomous driving planning algorithms proves the feasibility.
A Generalized Control Revision Method for Autonomous Driving Safety
Zhu, Zehang, Wang, Yuning, Ke, Tianqi, Han, Zeyu, Xu, Shaobing, Xu, Qing, Dolan, John M., Wang, Jianqiang
Safety is one of the most crucial challenges of autonomous driving vehicles, and one solution to guarantee safety is to employ an additional control revision module after the planning backbone. Control Barrier Function (CBF) has been widely used because of its strong mathematical foundation on safety. However, the incompatibility with heterogeneous perception data and incomplete consideration of traffic scene elements make existing systems hard to apply in dynamic and complex real-world scenarios. In this study, we introduce a generalized control revision method for autonomous driving safety, which adopts both vectorized perception and occupancy grid map as inputs and comprehensively models multiple types of traffic scene constraints based on a newly proposed barrier function. Traffic elements are integrated into one unified framework, decoupled from specific scenario settings or rules. Experiments on the CARLA, SUMO, and OnSite simulators show that the proposed algorithm can realize safe control revision under complicated scenes, adapting to various planning backbones, road topologies, and risk types. Physical platform validation also verifies the real-world application feasibility.
PreGSU-A Generalized Traffic Scene Understanding Model for Autonomous Driving based on Pre-trained Graph Attention Network
Wang, Yuning, Liu, Zhiyuan, Lin, Haotian, Jiang, Junkai, Xu, Shaobing, Wang, Jianqiang
Scene understanding, defined as learning, extraction, and representation of interactions among traffic elements, is one of the critical challenges toward high-level autonomous driving (AD). Current scene understanding methods mainly focus on a single concrete task, such as trajectory prediction or risk level evaluation. Although they perform well on specific metrics, their generalization ability is insufficient to adapt to the real traffic complexity and downstream demand diversity. In this study, we propose PreGSU, a generalized pre-trained scene understanding model based on a graph attention network to learn the universal interaction and reasoning of traffic scenes to support various downstream tasks. After the feature engineering and sub-graph module, all elements are embedded as nodes to form a dynamic weighted graph. Then, four graph attention layers are applied to learn the relationships among agents and lanes. In the pre-training phase, the understanding model is trained on two self-supervised tasks: Virtual Interaction Force (VIF) modeling and Masked Road Modeling (MRM). Based on the artificial potential field theory, VIF modeling enables PreGSU to capture the agent-to-agent interactions while MRM extracts agent-to-road connections. In the fine-tuning process, the pre-trained parameters are loaded to derive detailed understanding outputs. We conduct validation experiments on two downstream tasks, i.e., trajectory prediction in an urban scenario and intention recognition in a highway scenario, to verify the generalization ability and understanding ability. Results show that compared with the baselines, PreGSU achieves better accuracy on both tasks, indicating the potential to be generalized to various scenes and targets. An ablation study shows the effectiveness of the pre-training task design.
D2E-An Autonomous Decision-making Dataset involving Driver States and Human Evaluation
Ke, Zehong, Jiang, Yanbo, Wang, Yuning, Cheng, Hao, Li, Jinhao, Wang, Jianqiang
With the advancement of deep learning technology, data-driven methods are increasingly used in the decision-making of autonomous driving, and the quality of datasets greatly influences the model performance. Although current datasets have made significant progress in the collection of vehicle and environment data, emphasis on human-end data including the driver states and human evaluation is not sufficient. In addition, existing datasets consist mostly of simple scenarios such as car following, resulting in low interaction levels. In this paper, we introduce the Driver to Evaluation dataset (D2E), an autonomous decision-making dataset that contains data on driver states, vehicle states, environmental situations, and evaluation scores from human reviewers, covering a comprehensive process of vehicle decision-making. Apart from regular agents and surrounding environment information, we not only collect driver factor data including first-person view videos, physiological signals, and eye attention data, but also provide subjective rating scores from 40 human volunteers. The dataset is a mix of driving simulator scenes and real-road ones. High-interaction situations are designed and filtered to ensure behavior diversity. Through data organization, analysis, and preprocessing, D2E contains over 1100 segments of interactive driving case data ranging from human driver factors to evaluation results, supporting the development of data-driven decision-making related algorithms.
Integrating Wearable Sensor Data and Self-reported Diaries for Personalized Affect Forecasting
Yang, Zhongqi, Wang, Yuning, Yamashita, Ken S., Sabah, Maryam, Khatibi, Elahe, Azimi, Iman, Dutt, Nikil, Borelli, Jessica L., Rahmani, Amir M.
Emotional states, as indicators of affect, are pivotal to overall health, making their accurate prediction before onset crucial. Current studies are primarily centered on immediate short-term affect detection using data from wearable and mobile devices. These studies typically focus on objective sensory measures, often neglecting other forms of self-reported information like diaries and notes. In this paper, we propose a multimodal deep learning model for affect status forecasting. This model combines a transformer encoder with a pre-trained language model, facilitating the integrated analysis of objective metrics and self-reported diaries. To validate our model, we conduct a longitudinal study, enrolling college students and monitoring them over a year, to collect an extensive dataset including physiological, environmental, sleep, metabolic, and physical activity parameters, alongside open-ended textual diaries provided by the participants. Our results demonstrate that the proposed model achieves predictive accuracy of 82.50% for positive affect and 82.76% for negative affect, a full week in advance. The effectiveness of our model is further elevated by its explainability.
Physics-informed deep-learning applications to experimental fluid mechanics
Eivazi, Hamidreza, Wang, Yuning, Vinuesa, Ricardo
High-resolution reconstruction of flow-field data from low-resolution and noisy measurements is of interest due to the prevalence of such problems in experimental fluid mechanics, where the measurement data are in general sparse, incomplete and noisy. Deep-learning approaches have been shown to be suitable for such super-resolution tasks. However, a high number of high-resolution examples is needed, which may not be available for many cases. Moreover, the obtained predictions may fail to comply with physical principles, e.g. mass and momentum conservation. Physics-informed deep learning provides frameworks for integrating data and physical laws for learning. In this study, we apply physics-informed neural networks (PINNs) for super-resolution of flow-field data both in time and space from a limited set of noisy measurements without having any high-resolution reference data. Our objective is to obtain a continuous solution of the problem, providing a physically-consistent prediction at any point in the solution domain. We demonstrate the applicability of PINNs for the super-resolution of flow-field data in time and space through three canonical cases: Burgers' equation, two-dimensional vortex shedding behind a circular cylinder and the minimal turbulent channel flow. The robustness of the models is also investigated by adding synthetic Gaussian noise. Furthermore, we show the capabilities of PINNs to improve the resolution and reduce the noise in a real experimental dataset consisting of hot-wire-anemometry measurements. Our results show the adequate capabilities of PINNs in the context of data augmentation for experiments in fluid mechanics.
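For the Burgers' equation case, the way a PINN combines sparse measurements with the governing equation can be sketched as a composite loss. The form below is the generic one-dimensional viscous Burgers' formulation and is an illustrative assumption; the notation ($u_\theta$ for the network output, $\nu$ for viscosity, $N_d$ measurement points, $N_c$ collocation points, weight $\lambda$) is not taken from the paper, whose exact loss may differ.

```latex
% Viscous Burgers' equation (one spatial dimension)
\frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x}
  = \nu \frac{\partial^2 u}{\partial x^2}

% Data misfit on the N_d (possibly noisy) measurements u_i
\mathcal{L}_{\mathrm{data}} = \frac{1}{N_d} \sum_{i=1}^{N_d}
  \left| u_\theta(x_i, t_i) - u_i \right|^2

% PDE residual evaluated at N_c collocation points
\mathcal{L}_{\mathrm{pde}} = \frac{1}{N_c} \sum_{j=1}^{N_c}
  \left| \frac{\partial u_\theta}{\partial t}
       + u_\theta \frac{\partial u_\theta}{\partial x}
       - \nu \frac{\partial^2 u_\theta}{\partial x^2} \right|^2_{(x_j, t_j)}

% Total loss with an assumed weighting factor \lambda
\mathcal{L} = \mathcal{L}_{\mathrm{data}} + \lambda \, \mathcal{L}_{\mathrm{pde}}
```

Because $u_\theta$ is a continuous function of $(x, t)$, it can be evaluated at any point of the domain once trained, which is what allows super-resolution without high-resolution reference data.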
A Risk-aware Planning Framework of UGVs in Off-Road Environment
Jiang, Junkai, Hu, Zhenhua, Xie, Zihan, Hao, Changlong, Liu, Hongyu, Xu, Wenliang, Wang, Yuning, He, Lei, Xu, Shaobing, Wang, Jianqiang
The planning module is an essential component of intelligent vehicle study. In this paper, we address the risk-aware planning problem of UGVs through a global-local planning framework which seamlessly integrates risk assessment methods. In particular, a global planning algorithm named Coarse2fine A* is proposed, which incorporates a potential field approach to enhance the safety of the planning results while ensuring the efficiency of the algorithm. A deterministic sampling method for local planning is leveraged and modified to suit off-road environments. It also integrates a risk assessment model to emphasize the avoidance of local risks. The performance of the algorithm is demonstrated through simulation experiments by comparing it with baseline algorithms, where the results of Coarse2fine A* are shown to be approximately 30% safer than those of the baseline algorithms. The practicality and effectiveness of the proposed planning framework are validated by deploying it on a real-world system consisting of a control center and a practical UGV platform.
$\beta$-Variational autoencoders and transformers for reduced-order modelling of fluid flows
Solera-Rico, Alberto, Vila, Carlos Sanmiguel, Gómez, M. A., Wang, Yuning, Almashjary, Abdulrahman, Dawson, Scott T. M., Vinuesa, Ricardo
Variational autoencoder (VAE) architectures have the potential to develop reduced-order models (ROMs) for chaotic fluid flows. We propose a method for learning compact and near-orthogonal ROMs using a combination of a $\beta$-VAE and a transformer, tested on numerical data from a two-dimensional viscous flow in both periodic and chaotic regimes. The $\beta$-VAE is trained to learn a compact latent representation of the flow velocity, and the transformer is trained to predict the temporal dynamics in latent space. Using the $\beta$-VAE to learn disentangled representations in latent space, we obtain a more interpretable flow model with features that resemble those observed in the proper orthogonal decomposition, but with a more efficient representation. Using Poincaré maps, the results show that our method can capture the underlying dynamics of the flow, outperforming other prediction models. The proposed method has potential applications in other fields such as weather forecasting, structural dynamics or biomedical engineering.