 road environment


A Siamese Transformer with Hierarchical Refinement for Lane Detection

Neural Information Processing Systems

Lane detection is an important yet challenging task in autonomous driving systems. Existing lane detection methods mainly rely on finer-scale information to identify key points of lane lines. Since local information in realistic road environments is frequently obscured by other vehicles or affected by poor outdoor lighting conditions, these methods struggle with the regression of such key points. In this paper, we propose a novel Siamese Transformer with hierarchical refinement for lane detection to improve the detection accuracy in complex road environments. Specifically, we propose a high-to-low hierarchical refinement Transformer structure, called LAne TRansformer (LATR), to refine the key points of lane lines, which integrates global semantics information and finer-scale features. Moreover, exploiting the thin and long characteristics of lane lines, we propose a novel Curve-IoU loss to supervise the fit of lane lines. Extensive experiments on three benchmark datasets of lane detection demonstrate that our proposed new method achieves state-of-the-art results with high accuracy and efficiency. Specifically, our method achieves improved F1 scores on the OpenLane dataset, surpassing the current best-performing method by 5.0 points.


Camera Agnostic Two-Head Network for Ego-Lane Inference

arXiv.org Artificial Intelligence

Vision-based ego-lane inference using High-Definition (HD) maps is essential in autonomous driving and advanced driver assistance systems. The traditional approach requires well-calibrated cameras, which limits variation in camera configuration, as the algorithm relies on intrinsic and extrinsic calibration. In this paper, we propose a learning-based ego-lane inference method that directly estimates the ego-lane index from a single image. To improve robustness, our model incorporates a two-head structure that infers the ego-lane from two perspectives simultaneously. Furthermore, we utilize an attention mechanism guided by the vanishing point and line to adapt to changes in viewpoint without requiring accurate calibration. The high adaptability of our model was validated in diverse environments, devices, and camera mounting points and orientations.


RedMotion: Motion Prediction via Redundancy Reduction

arXiv.org Artificial Intelligence

Predicting the future motion of traffic agents is vital for self-driving vehicles to ensure their safe operation. We introduce RedMotion, a transformer model for motion prediction that incorporates two types of redundancy reduction. The first type of redundancy reduction is induced by an internal transformer decoder and reduces a variable-sized set of road environment tokens, such as road graphs with agent data, to a fixed-sized embedding. The second type of redundancy reduction is a self-supervised learning objective and applies the redundancy reduction principle to embeddings generated from augmented views of road environments. Our experiments reveal that our representation learning approach can outperform PreTraM, Traj-MAE, and GraphDINO in a semi-supervised setting. Our RedMotion model achieves results that are competitive with those of Scene Transformer or MTR++. We provide an open source implementation that is accessible via GitHub and Colab. It is essential for self-driving vehicles to understand the relation between the motion of traffic agents and the surrounding road environment. Motion prediction aims to predict the future trajectory of traffic agents based on past trajectories and the given traffic scenario. Recent state-of-the-art methods (e.g., Shi et al. (2022); Wang et al. (2023); Nayakanti et al. (2023)) are deep learning methods trained using supervised learning.
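The second type of redundancy reduction described above can be illustrated with a small sketch. The abstract does not give the exact objective, so this assumes a cross-correlation formulation in the style of Barlow Twins: embeddings of two augmented views are standardized per feature, and the loss pushes their cross-correlation matrix toward the identity. The function name `redundancy_reduction_loss` and the weight `off_diag_weight` are hypothetical and chosen only for illustration.

```python
import math


def redundancy_reduction_loss(z_a, z_b, off_diag_weight=0.005, eps=1e-9):
    """Cross-correlation loss between two batches of embeddings.

    z_a, z_b: lists of n embeddings (each a list of d floats) computed from
    two augmented views of the same n road environments.
    Assumption: Barlow Twins-style objective; not taken from the paper.
    """
    n, d = len(z_a), len(z_a[0])

    def standardize(z):
        # Return d columns, each with zero mean and (approximately) unit std.
        columns = []
        for j in range(d):
            col = [row[j] for row in z]
            mean = sum(col) / n
            std = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
            columns.append([(v - mean) / (std + eps) for v in col])
        return columns

    a, b = standardize(z_a), standardize(z_b)
    loss = 0.0
    for i in range(d):
        for j in range(d):
            # Entry (i, j) of the cross-correlation matrix between the views.
            c_ij = sum(x * y for x, y in zip(a[i], b[j])) / n
            if i == j:
                loss += (1.0 - c_ij) ** 2  # invariance: same feature should agree
            else:
                loss += off_diag_weight * c_ij ** 2  # decorrelate different features
    return loss


# Two identical views with decorrelated features give a loss near zero;
# duplicating a feature across dimensions is penalized by the off-diagonal term.
decorrelated = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
redundant = [[1.0, 1.0], [-1.0, -1.0], [1.0, 1.0], [-1.0, -1.0]]
print(redundancy_reduction_loss(decorrelated, decorrelated))
print(redundancy_reduction_loss(redundant, redundant))
```

In this toy setting the diagonal term rewards embeddings that stay stable under augmentation, while the off-diagonal term discourages different embedding dimensions from carrying the same information.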


CV2X-LOCA: Roadside Unit-Enabled Cooperative Localization Framework for Autonomous Vehicles

arXiv.org Artificial Intelligence

An accurate and robust localization system is crucial for autonomous vehicles (AVs) to enable safe driving in urban scenes. While existing global navigation satellite system (GNSS)-based methods are effective at locating vehicles in open-sky regions, achieving high-accuracy positioning in urban canyons such as lower layers of multi-layer bridges, streets beside tall buildings, tunnels, etc., remains a challenge. In this paper, we investigate the potential of cellular-vehicle-to-everything (C-V2X) wireless communications in improving the localization performance of AVs under GNSS-denied environments. Specifically, we propose the first roadside unit (RSU)-enabled cooperative localization framework, namely CV2X-LOCA, that only uses C-V2X channel state information to achieve lane-level positioning accuracy. CV2X-LOCA consists of four key parts: data processing module, coarse positioning module, environment parameter correcting module, and vehicle trajectory filtering module. These modules jointly handle challenges present in dynamic C-V2X networks. Extensive simulation and field experiments show that CV2X-LOCA achieves state-of-the-art performance for vehicle localization even under noisy conditions with high-speed movement and sparse RSU coverage. The study results also provide insights into future investment decisions for transportation agencies regarding deploying RSUs cost-effectively.


Hybrid tracker based optimal path tracking system for complex road environments for autonomous driving

arXiv.org Artificial Intelligence

The path tracking system is a key technology in autonomous driving. The system should drive the vehicle accurately along the lane without causing inconvenience to passengers. To address these requirements, this paper proposes a hybrid tracker based optimal path tracking system. By applying a deep learning based lane detection algorithm and a designated fast lane fitting algorithm, this paper develops a lane processing algorithm that matches actual lanes with minimal computational cost. In addition, three modified path tracking algorithms were designed using the GPS based path or the vision based path. In a driving system, the match rate with the ideal path does not necessarily reflect driving stability. The proposed system therefore applies the concept of an observer that selects the optimal tracker in complex road environments. Driving stability was studied in complex road environments such as straight roads with multiple 3-way junctions, roundabouts, intersections, and tunnels. The proposed system experimentally showed high performance with consistent driving comfort, keeping the vehicle accurately within the lanes even under highly complex road conditions. Code will be available at https://github.com/DGIST-ARTIV.


V2I Connectivity-Based Dynamic Queue-Jump Lane for Emergency Vehicles: A Deep Reinforcement Learning Approach

arXiv.org Artificial Intelligence

Emergency vehicle (EMV) service is a key function of cities and is exceedingly challenging due to urban traffic congestion. A major reason behind EMV service delay is the lack of communication and cooperation between vehicles blocking EMVs. In this paper, we study the improvement of EMV service under V2I connectivity. We consider the establishment of dynamic queue jump lanes (DQJLs) based on real-time coordination of connected vehicles. We develop a novel Markov decision process formulation for the DQJL problem, which explicitly accounts for the uncertainty of drivers' reaction to approaching EMVs. We propose a deep neural network-based reinforcement learning algorithm that efficiently computes the optimal coordination instructions. We also validate our approach on a micro-simulation testbed using Simulation of Urban Mobility (SUMO). Validation results show that with our proposed methodology, the centralized control system reduces EMV passing time by approximately 15% compared with the benchmark system.


Driverless car tests move step closer to West Midlands roads

#artificialintelligence

The testing of driverless cars on West Midlands roads looks set to move a step closer after planning applications were submitted for infrastructure to support the technology. Back in 2018 it was announced that the West Midlands had won a national competition to become the UK's first testbed for Connected and Autonomous Vehicles (CAVs), more popularly known as 'driverless cars'. The testbed will provide the infrastructure for the testing of the technology across Birmingham, Coventry and Solihull. And now it looks as though introduction of the technology to the region's roads has moved one step closer. Planning applications for two CAV masts have been put forward by Transport for West Midlands (TfWM), which is responsible for running the project.


Real-time Multi-target Path Prediction and Planning for Autonomous Driving aided by FCN

arXiv.org Artificial Intelligence

Real-time multi-target path planning is a key issue in the field of autonomous driving. Although multiple paths can be generated in real time with polynomial curves, the generated paths are not flexible enough to deal with complex road scenes such as S-shaped roads and unstructured scenes such as parking lots. Search and sampling-based methods, such as A* and RRT and their derived methods, are flexible in generating paths for these complex road environments. However, the existing algorithms require significant time to plan paths to multiple targets, which greatly limits their application in autonomous driving. In this paper, a real-time path planning method for multiple targets is proposed. We first train a fully convolutional neural network (FCN) to predict a path region for the target. Taking the predicted path region as a soft constraint, the A* algorithm is then applied to search for the exact path to the target. Experiments show that the FCN can make multiple predictions in a very short time (50 predictions in 40 ms), and that the predicted path region effectively restricts the search space for the subsequent A* search. As a result, A* searches much faster, so that multi-target path planning can be achieved in real time (3 targets in less than 100 ms).
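The idea of using a predicted path region as a soft constraint for A* can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the grid is 4-connected, each step costs 1, and a per-cell penalty (a stand-in for "outside the region predicted by the FCN") is added to the step cost instead of forbidding the cell outright. The function name `astar_soft` and the penalty grid are hypothetical.

```python
import heapq


def astar_soft(grid_penalty, start, goal, blocked=frozenset()):
    """A* on a 4-connected grid with an additive soft penalty per cell.

    grid_penalty[r][c] >= 0 is added to the unit cost of entering (r, c).
    Cells in `blocked` are hard obstacles. Returns (path, cost), or
    (None, inf) when the goal is unreachable.
    """
    rows, cols = len(grid_penalty), len(grid_penalty[0])

    def heuristic(cell):
        # Manhattan distance is admissible because every step costs at least 1.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0.0, start)]
    best_cost = {start: 0.0}
    parent = {start: None}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:  # walk parents back to the start
                path.append(cell)
                cell = parent[cell]
            return path[::-1], cost
        if cost > best_cost[cell]:
            continue  # stale heap entry, a cheaper route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < rows and 0 <= nc < cols) or (nr, nc) in blocked:
                continue
            new_cost = cost + 1.0 + grid_penalty[nr][nc]
            if new_cost < best_cost.get((nr, nc), float("inf")):
                best_cost[(nr, nc)] = new_cost
                parent[(nr, nc)] = cell
                heapq.heappush(
                    open_heap, (new_cost + heuristic((nr, nc)), new_cost, (nr, nc))
                )
    return None, float("inf")


# Zero penalty marks the (assumed) predicted path region: top row, right column.
penalty = [
    [0, 0, 0],
    [5, 5, 0],
    [5, 5, 0],
]
path, cost = astar_soft(penalty, (0, 0), (2, 2))
print(path, cost)
```

Because the penalty is added to the cost and not used as a hard mask, the search still returns a path when the predicted region is imperfect; it simply prefers cells inside the region, which is what keeps the explored area small.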


Foresee: Attentive Future Projections of Chaotic Road Environments with Online Training

arXiv.org Machine Learning

In this paper, we train a recurrent neural network to learn the dynamics of a chaotic road environment and to project the future of the environment onto an image. Future projection can be used to anticipate an unseen environment, for example in autonomous driving. The road environment is highly dynamic and complex due to the interaction among traffic participants such as vehicles and pedestrians. Even in this complex environment, a human driver can drive safely on chaotic roads irrespective of the number of traffic participants. The proliferation of deep learning research has shown the efficacy of neural networks in learning this human behavior. In the same direction, we investigate recurrent neural networks to understand the chaotic road environment, which is shared by pedestrians, vehicles (cars, trucks, bicycles etc.), and sometimes animals as well. We propose Foresee, a unidirectional gated recurrent unit (GRU) network with attention, to project the future of the environment in the form of images. We have collected several videos on Delhi roads covering various traffic participants and differences in background and infrastructure (like 3D pedestrian crossings) at various times on various days. We train Foresee in an unsupervised way and use online training to project frames up to 0 . We show that our proposed model performs better than state-of-the-art methods (prednet [20], Enc. Dec. LSTM [28]), and finally we show that our trained model generalizes to a public dataset for future projections. Environment anticipation is an important task for situation awareness and decision making.