Cheng, Yang
A Digital Twin Framework for Physical-Virtual Integration in V2X-Enabled Connected Vehicle Corridors
Wu, Keshu, Li, Pei, Cheng, Yang, Parker, Steven T., Ran, Bin, Noyce, David A., Ye, Xinyue
Transportation Cyber-Physical Systems (T-CPS) are critical in improving traffic safety, reliability, and sustainability by integrating computing, communication, and control in transportation systems. The connected vehicle corridor is at the forefront of this transformation, where Cellular Vehicle-to-Everything (C-V2X) technology facilitates real-time data exchange between infrastructure, vehicles, and road users. However, challenges remain in processing and synchronizing the vast V2X data from vehicles and roadside units, particularly when ensuring scalability, data integrity, and operational resilience. This paper presents a digital twin framework for T-CPS, developed from a real-world connected vehicle corridor to address these challenges. By leveraging C-V2X technology and real-time data from infrastructure, vehicles, and road users, the digital twin accurately replicates vehicle behaviors, signal phases, and traffic patterns within the CARLA simulation environment. This framework demonstrates high fidelity between physical and digital systems and ensures robust synchronization of vehicle trajectories and signal phases through extensive experiments. Moreover, the digital twin's scalable and redundant architecture enhances data integrity, making it capable of supporting future large-scale C-V2X deployments. The digital twin is a vital tool in T-CPS, enabling real-time traffic monitoring, prediction, and optimization to enhance the reliability and safety of transportation systems.
Real-World Data Inspired Interactive Connected Traffic Scenario Generation
You, Junwei, Li, Pei, Cheng, Yang, Wu, Keshu, Gan, Rui, Parker, Steven T., Ran, Bin
Simulation is a crucial step in ensuring accurate, efficient, and realistic testing and validation of Connected and Autonomous Vehicles (CAVs). As the adoption of CAVs accelerates, the integration of real-world data into simulation environments becomes increasingly critical. Among various technologies utilized by CAVs, Vehicle-to-Everything (V2X) communication plays a crucial role in ensuring a seamless transmission of information between CAVs, infrastructure, and other road users. However, most existing studies have focused on developing and testing communication protocols, resource allocation strategies, and data dissemination techniques in V2X. There remains a gap in integrating real-world V2X data into simulations to generate diverse and high-fidelity traffic scenarios. To fill this research gap, we leverage real-world Signal Phase and Timing (SPaT) data from Roadside Units (RSUs) to enhance the fidelity of CAV simulations. Moreover, we developed an algorithm that enables Autonomous Vehicles (AVs) to respond dynamically to real-time traffic signal data, simulating realistic V2X communication scenarios. Such high-fidelity simulation environments can generate multimodal data, including trajectory, semantic camera, depth camera, and bird's eye view data for various traffic scenarios. The generated scenarios and data provide invaluable insights into AVs' interactions with traffic infrastructure and other road users. This work aims to bridge the gap between theoretical research and practical deployment of CAVs, facilitating the development of smarter and safer transportation systems.
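The abstract above does not spell out how an AV responds to SPaT data, so the following is only a minimal sketch of one plausible rule, not the authors' algorithm. The names `SpatMessage` and `decide_action`, the phase labels, and the assumption that a red phase is followed by green are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class SpatMessage:
    """Hypothetical, simplified view of one SPaT update for a single approach."""
    phase: str                # assumed labels: "green", "yellow", or "red"
    time_to_change_s: float   # seconds until the current phase ends


def decide_action(spat: SpatMessage, distance_m: float, speed_mps: float) -> str:
    """Return "proceed" or "stop" for a vehicle approaching the stop bar.

    Assumption: the vehicle holds its current speed, and red is followed by green.
    """
    if speed_mps <= 0.0:
        # A stationary vehicle only moves off on green.
        return "proceed" if spat.phase == "green" else "stop"

    arrival_s = distance_m / speed_mps  # time to reach the stop bar

    if spat.phase in ("green", "yellow"):
        # Proceed only if the stop bar is reached before the phase ends.
        return "proceed" if arrival_s <= spat.time_to_change_s else "stop"
    if spat.phase == "red":
        # Proceed only if the red phase will have ended on arrival.
        return "proceed" if arrival_s > spat.time_to_change_s else "stop"
    raise ValueError(f"unknown phase: {spat.phase!r}")


# Example: 50 m from the stop bar at 10 m/s, so arrival takes 5 s.
print(decide_action(SpatMessage("green", 10.0), 50.0, 10.0))
print(decide_action(SpatMessage("red", 10.0), 50.0, 10.0))
```

A real controller would also account for deceleration limits, queues, and message latency; the sketch only shows how remaining phase time and arrival time can be compared.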
Voltage-Controlled Magnetoelectric Devices for Neuromorphic Diffusion Process
Cheng, Yang, Shu, Qingyuan, Lee, Albert, He, Haoran, Zhu, Ivy, Suhail, Haris, Chen, Minzhang, Chen, Renhe, Wang, Zirui, Zhang, Hantao, Wang, Chih-Yao, Yang, Shan-Yi, Hsin, Yu-Chen, Shih, Cheng-Yi, Lee, Hsin-Han, Cheng, Ran, Pamarti, Sudhakar, Kou, Xufeng, Wang, Kang L.
Stochastic diffusion processes are pervasive in nature, from the seemingly erratic Brownian motion to the complex interactions of synaptically-coupled spiking neurons. Recently, drawing inspiration from Langevin dynamics, neuromorphic diffusion models were proposed and have become one of the major breakthroughs in the field of generative artificial intelligence. Unlike discriminative models that have been well developed to tackle classification or regression tasks, diffusion models as well as other generative models such as ChatGPT aim at creating content based upon contexts learned. However, the more complex algorithms of these models result in high computational costs using today's technologies, creating a bottleneck in their efficiency and impeding further development. Here, we develop a spintronic voltage-controlled magnetoelectric memory hardware for the neuromorphic diffusion process. The in-memory computing capability of our spintronic devices goes beyond the current von Neumann architecture, where memory and computing units are separated. Together with the non-volatility of magnetic memory, we can achieve high-speed and low-cost computing, which is desirable for the increasing scale of generative models in the current era. We experimentally demonstrate that the hardware-based true random diffusion process can be implemented for image generation and achieve comparable image quality to software-based training as measured by the Fréchet inception distance (FID) score, achieving ~10^3 better energy-per-bit-per-area over traditional hardware.
On Large Language Models' Hallucination with Regard to Known Facts
Jiang, Che, Qi, Biqing, Hong, Xiangyu, Fu, Dayuan, Cheng, Yang, Meng, Fandong, Yu, Mo, Zhou, Bowen, Zhou, Jie
Large language models are successful in answering factoid questions but are also prone to hallucination. We investigate the phenomenon of LLMs possessing correct answer knowledge yet still hallucinating from the perspective of inference dynamics, an area not previously covered in studies on hallucinations. We are able to conduct this analysis via two key ideas. First, we identify the factual questions that query the same triplet knowledge but result in different answers. The difference between the model behaviors on the correct and incorrect outputs hence suggests the patterns when hallucinations happen. Second, to measure the pattern, we utilize mappings from the residual streams to vocabulary space. We reveal the different dynamics of the output token probabilities along the depths of layers between the correct and hallucinated cases. In hallucinated cases, the output token's information rarely demonstrates abrupt increases and consistent superiority in the later stages of the model. Leveraging the dynamic curve as a feature, we build a classifier capable of accurately detecting hallucinatory predictions with an 88% success rate. Our study sheds light on understanding the reasons for LLMs' hallucinations on their known facts, and more importantly, on accurately predicting when they are hallucinating.
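The idea of mapping residual streams to vocabulary space and tracking the output token's probability across layers can be illustrated with a small sketch. It assumes a plain linear projection through an unembedding matrix followed by a softmax; the abstract does not specify the exact mapping, and the function names, toy dimensions, and feature choices below are hypothetical.

```python
import numpy as np


def softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def token_probability_curve(residuals: np.ndarray,
                            unembed: np.ndarray,
                            token_id: int) -> np.ndarray:
    """Probability of `token_id` at each layer.

    residuals: (num_layers, d_model) residual-stream vectors at one position.
    unembed:   (d_model, vocab_size) projection to vocabulary space (assumed linear).
    """
    logits = residuals @ unembed            # (num_layers, vocab_size)
    return softmax(logits)[:, token_id]     # (num_layers,)


def curve_features(curve: np.ndarray) -> dict:
    """Hypothetical summary features of the curve for a downstream classifier."""
    jumps = np.diff(curve)
    return {
        "max_jump": float(jumps.max()),                          # largest layer-to-layer increase
        "final_prob": float(curve[-1]),                          # probability at the last layer
        "late_mean": float(curve[len(curve) // 2:].mean()),      # average over later layers
    }


# Toy data: 4 layers, identity unembedding, residuals growing along token 2.
unembed = np.eye(4)
residuals = np.outer([0.0, 1.0, 2.0, 4.0], unembed[2])
curve = token_probability_curve(residuals, unembed, token_id=2)
print(curve)
print(curve_features(curve))
```

In this toy case the curve rises with depth, which is the kind of pattern the abstract associates with correct answers; a real analysis would use a model's actual residual streams and unembedding weights.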
Truck Parking Usage Prediction with Decomposed Graph Neural Networks
Tamaru, Rei, Cheng, Yang, Parker, Steven, Perry, Ernie, Ran, Bin, Ahn, Soyoung
Truck parking on freight corridors faces various challenges, such as insufficient parking spaces and compliance with Hours-of-Service (HOS) regulations. These constraints often result in unauthorized parking practices, causing safety concerns. To enhance the safety of freight operations, providing accurate parking usage prediction proves to be a cost-effective solution. Although existing research demonstrates satisfactory accuracy for predicting usage at individual truck parking sites, few approaches have been proposed for predicting usage with spatial dependencies across multiple truck parking sites. We present the Regional Temporal Graph Neural Network (RegT-GCN) as a predictive framework for assessing parking usage across the entire state to provide better truck parking information and mitigate unauthorized parking. The framework leverages the topological structures of truck parking site distributions and historical parking data to predict occupancy rates across a state. To achieve this, we introduce a Regional Decomposition approach, which effectively captures the geographical characteristics. We also introduce a spatial module that works efficiently with the temporal module. Evaluation results demonstrate that the proposed model surpasses other baseline models, improving the performance by more than 20% compared with the original model. The proposed model makes truck parking sites aware of the topological structures and provides higher performance.
BML: A High-performance, Low-cost Gradient Synchronization Algorithm for DML Training
Wang, Songtao, Li, Dan, Cheng, Yang, Geng, Jinkun, Wang, Yanshu, Wang, Shuai, Xia, Shu-Tao, Wu, Jianping
In distributed machine learning (DML), the network performance between machines significantly impacts the speed of iterative training. In this paper, we propose BML, a new gradient synchronization algorithm with higher network performance and lower network cost than the current practice. BML runs on a BCube network instead of the traditional Fat-Tree topology. The BML algorithm is designed in such a way that, compared to the parameter server (PS) algorithm on a Fat-Tree network connecting the same number of server machines, BML theoretically achieves 1/k of the gradient synchronization time, with k/5 of the switches (the typical value of k is 2∼4). Experiments with the LeNet-5 and VGG-19 benchmarks on a testbed with 9 dual-GPU servers show that BML reduces the job completion time of DML training by up to 56.4%.
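As a quick worked check of the theoretical ratios quoted above (1/k of the synchronization time and k/5 of the switches, relative to the PS algorithm on a Fat-Tree), the following snippet tabulates both for the typical values of k. The function name is hypothetical, and the code only restates the arithmetic from the abstract rather than modelling BML itself.

```python
from fractions import Fraction


def bml_theoretical_ratios(k: int) -> tuple:
    """Return (sync_time_ratio, switch_ratio) of BML relative to PS on Fat-Tree.

    Both ratios are taken directly from the abstract: 1/k and k/5.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    return Fraction(1, k), Fraction(k, 5)


# Typical values of k according to the abstract.
for k in (2, 3, 4):
    sync_ratio, switch_ratio = bml_theoretical_ratios(k)
    print(f"k={k}: sync time x{float(sync_ratio):.2f}, switches x{float(switch_ratio):.2f}")
```

Larger k shortens the theoretical synchronization time but raises the switch count, so the two ratios pull in opposite directions over the typical range.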