
Collaborating Authors

 Gu, Weihao


Diffusion-Based Planning for Autonomous Driving with Flexible Guidance

arXiv.org Artificial Intelligence

Achieving human-like driving behaviors in complex open-world environments is a critical challenge in autonomous driving. Contemporary learning-based planning approaches, such as imitation learning methods, often struggle to balance competing objectives and lack safety assurance, due to limited adaptability and inadequate capacity for learning the complex multi-modal behaviors commonly exhibited in human planning, not to mention their strong reliance on rule-based fallback strategies. We propose a novel transformer-based Diffusion Planner for closed-loop planning, which can effectively model multi-modal driving behavior and ensure trajectory quality without any rule-based refinement. Our model supports joint modeling of both prediction and planning tasks under the same architecture, enabling cooperative behaviors between vehicles. Moreover, by learning the gradient of the trajectory score function and employing a flexible classifier guidance mechanism, Diffusion Planner effectively achieves safe and adaptable planning behaviors. Evaluations on the large-scale real-world autonomous planning benchmark nuPlan and our newly collected 200-hour delivery-vehicle driving dataset demonstrate that Diffusion Planner achieves state-of-the-art closed-loop performance with robust transferability across diverse driving styles.

Autonomous driving, as a cornerstone technology, is poised to usher transportation into a safer and more efficient era of mobility (Tampuu et al., 2020). The key challenge is achieving human-like driving behaviors in complex open-world environments while ensuring safety, efficiency, and comfort (Muhammad et al., 2020). Rule-based planning methods have demonstrated initial success in industrial applications (Fan et al., 2018) by defining driving behaviors and establishing boundaries derived from human knowledge. In contrast, learning-based planning methods acquire driving skills by cloning human driving behaviors from collected datasets (Caesar et al., 2021), a process made simpler through straightforward imitation learning losses. Additionally, the capabilities of these models can potentially be enhanced by scaling up training resources (Chen et al., 2023). Though promising, current learning-based planning methods still face several limitations.
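
The idea of steering a learned trajectory distribution with a differentiable cost can be illustrated with a minimal classifier-guided sampling loop. The sketch below is not the paper's implementation: `ScoreNet`, `guidance_cost`, the step sizes, and the noise schedule are all hypothetical stand-ins used only to show how a guidance gradient is mixed into the learned denoising direction at each step.

```python
# Minimal sketch of classifier-guided diffusion sampling for trajectories.
# `ScoreNet` and `guidance_cost` are illustrative stand-ins, not the paper's code.
import torch
import torch.nn as nn

class ScoreNet(nn.Module):
    """Toy score network: predicts a denoising direction for a flattened trajectory."""
    def __init__(self, horizon=20, dim=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(horizon * dim + 1, 128), nn.ReLU(),
            nn.Linear(128, horizon * dim),
        )
        self.horizon, self.dim = horizon, dim

    def forward(self, traj, t):
        x = torch.cat([traj.flatten(1), t[:, None]], dim=1)
        return self.net(x).view(-1, self.horizon, self.dim)

def guidance_cost(traj):
    """Hypothetical differentiable cost, here penalizing jerky motion."""
    accel = traj[:, 2:] - 2 * traj[:, 1:-1] + traj[:, :-2]
    return (accel ** 2).sum()

@torch.no_grad()
def sample(score_net, steps=50, guidance_scale=1.0, horizon=20, dim=2):
    traj = torch.randn(1, horizon, dim)            # start from Gaussian noise
    for k in reversed(range(steps)):
        t = torch.full((1,), k / steps)
        score = score_net(traj, t)                 # learned denoising direction
        with torch.enable_grad():                  # gradient of the guidance cost
            traj_g = traj.detach().requires_grad_(True)
            grad = torch.autograd.grad(guidance_cost(traj_g), traj_g)[0]
        traj = traj + 0.5 * score - guidance_scale * grad
        if k > 0:
            traj = traj + 0.05 * torch.randn_like(traj)   # stochastic update
    return traj

planned = sample(ScoreNet())
print(planned.shape)  # (1, 20, 2): a guided (x, y) trajectory
```

Because the guidance term is added only at sampling time, the same trained score network can be steered toward different objectives (e.g., comfort or clearance) without retraining, which is the flexibility the abstract refers to.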


S4TP: Social-Suitable and Safety-Sensitive Trajectory Planning for Autonomous Vehicles

arXiv.org Artificial Intelligence

On public roads, autonomous vehicles (AVs) face the challenge of frequent interactions with human-driven vehicles (HDVs), which exhibit uncertain driving behavior due to varying social characteristics among human drivers. To effectively assess the risks prevailing in the vicinity of AVs in socially interactive traffic scenarios and achieve safe autonomous driving, this article proposes a social-suitable and safety-sensitive trajectory planning (S4TP) framework. Specifically, S4TP integrates the Social-Aware Trajectory Prediction (SATP) and Social-Aware Driving Risk Field (SADRF) modules. SATP utilizes Transformers to effectively encode the driving scene and incorporates the AV's planned trajectory during the prediction decoding process. SADRF assesses the expected surrounding risk degrees during AV-HDV interactions, each with different social characteristics, visualized as two-dimensional heat maps centered on the AV. SADRF models the driving intentions of the surrounding HDVs and predicts trajectories based on the representation of vehicular interactions. S4TP employs an optimization-based approach for motion planning, utilizing the predicted HDV trajectories as input. With the integration of SADRF, S4TP performs real-time online optimization of the AV's planned trajectory within low-risk regions, thus improving the safety and interpretability of the planned trajectory. We have conducted comprehensive tests of the proposed method using the SMARTS simulator. Experimental results in complex social scenarios, such as unprotected left-turn intersections, merging, cruising, and overtaking, validate the superiority of the proposed S4TP in terms of safety and rationality. S4TP achieves a pass rate of 100% across all scenarios, surpassing the current state-of-the-art methods Fanta (98.25%) and Predictive-Decision (94.75%).
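
To make the notion of an AV-centered risk heat map concrete, the following sketch accumulates risk from predicted HDV positions onto a 2D grid. The Gaussian risk model, time decay, and grid parameters are illustrative assumptions, not SADRF's actual formulation.

```python
# Minimal sketch of a driving-risk heat map in the spirit of SADRF.
import numpy as np

def risk_field(predicted_trajs, grid=64, extent=30.0, sigma=2.0):
    """Accumulate risk from predicted HDV positions onto an AV-centered 2D grid."""
    xs = np.linspace(-extent, extent, grid)
    gx, gy = np.meshgrid(xs, xs)
    field = np.zeros((grid, grid))
    for traj in predicted_trajs:               # traj: (T, 2) future (x, y) points
        for t, (px, py) in enumerate(traj):
            decay = 0.9 ** t                   # nearer-term predictions weigh more
            field += decay * np.exp(-((gx - px) ** 2 + (gy - py) ** 2) / (2 * sigma ** 2))
    return field / field.max()                 # normalize peak risk to 1.0

# Two hypothetical HDV predictions crossing in front of the AV.
hdv_a = np.stack([np.linspace(5, 15, 10), np.zeros(10)], axis=1)
hdv_b = np.stack([np.full(10, 10.0), np.linspace(-10, 10, 10)], axis=1)
field = risk_field([hdv_a, hdv_b])
print(field.shape, field.max())                # (64, 64) heat map, peak value 1.0
```

A downstream optimizer can then treat the field values along a candidate trajectory as a penalty term, which is how low-risk regions constrain the planned motion.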


ModaLink: Unifying Modalities for Efficient Image-to-PointCloud Place Recognition

arXiv.org Artificial Intelligence

Place recognition is an important task for robots and autonomous cars to localize themselves and close loops in pre-built maps. While single-modal sensor-based methods have shown satisfactory performance, cross-modal place recognition that retrieves images from a point-cloud database remains a challenging problem. Current cross-modal methods transform images into 3D points using depth estimation for modality conversion, which is usually computationally intensive and requires expensive labeled data for depth supervision. In this work, we introduce a fast and lightweight framework to encode images and point clouds into place-distinctive descriptors. We propose an effective Field of View (FoV) transformation module to convert point clouds into a modality analogous to images. This module eliminates the necessity for depth estimation and helps subsequent modules achieve real-time performance. We further design a non-negative factorization-based encoder to extract mutually consistent semantic features between point clouds and images. This encoder yields more distinctive global descriptors for retrieval. Experimental results on the KITTI dataset show that our proposed methods achieve state-of-the-art performance while running in real time. Additional evaluation on the HAOMO dataset covering a 17 km trajectory further demonstrates the practical generalization capabilities of our approach. We have released the implementation of our methods as open source at: https://github.com/haomo-ai/ModaLink.git.
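
The core idea of an FoV transformation, rendering the point cloud as a camera-like image so no monocular depth estimation is needed, can be sketched with a simple pinhole projection. The intrinsics, axis convention, and resolution below are assumed values for illustration, not ModaLink's actual configuration.

```python
# Minimal sketch of a field-of-view projection that renders a LiDAR point cloud
# as a sparse, camera-like depth image (assumed intrinsics and axis convention).
import numpy as np

def fov_project(points, fx=350.0, fy=350.0, cx=320.0, cy=120.0, h=240, w=640):
    """Project forward-facing points (N, 3) into a sparse depth image (h, w)."""
    img = np.zeros((h, w), dtype=np.float32)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    mask = x > 1.0                                    # keep points in front of the camera
    u = (fx * (-y[mask]) / x[mask] + cx).astype(int)  # pinhole projection, x forward
    v = (fy * (-z[mask]) / x[mask] + cy).astype(int)
    keep = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    img[v[keep], u[keep]] = x[mask][keep]             # store depth at each valid pixel
    return img

cloud = np.random.uniform([-5, -20, -2], [60, 20, 3], size=(20000, 3))
depth_image = fov_project(cloud)
print(depth_image.shape, (depth_image > 0).mean())    # (240, 640) and its fill ratio
```

Since the projection is a fixed geometric operation rather than a learned network, it runs in microseconds and leaves the heavy lifting to the shared descriptor encoder.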


OverlapTransformer: An Efficient and Rotation-Invariant Transformer Network for LiDAR-Based Place Recognition

arXiv.org Artificial Intelligence

Place recognition is an important capability for autonomously navigating vehicles operating in complex environments and under changing conditions. It is a key component for tasks such as loop closing in SLAM or global localization. In this paper, we address the problem of place recognition based on 3D LiDAR scans recorded by an autonomous vehicle. We propose a novel lightweight neural network exploiting the range image representation of LiDAR sensors to achieve fast execution with less than 2 ms per frame. We design a yaw-angle-invariant architecture exploiting a transformer network, which boosts the place recognition performance of our method. We evaluate our approach on the KITTI and Ford Campus datasets. The experimental results show that our method effectively detects loop closures, outperforming the state-of-the-art methods, and generalizes well across different environments. To evaluate long-term place recognition performance, we provide a novel dataset containing LiDAR sequences recorded by a mobile robot in repetitive places at different times. The implementation of our method and the dataset are released at: https://github.com/haomo-ai/OverlapTransformer
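
The yaw-invariance property can be demonstrated with a toy example: if features are computed per range-image column and then pooled over the yaw axis, rotating the scan (i.e., cyclically shifting the columns) leaves the global descriptor unchanged. The tiny convolutional encoder and max pooling below are illustrative stand-ins for the paper's transformer and pooling head.

```python
# Minimal sketch of why pooling over range-image columns yields a yaw-invariant descriptor.
import torch
import torch.nn as nn

class ColumnDescriptor(nn.Module):
    def __init__(self, height=64, feat=128):
        super().__init__()
        # Collapse the vertical (elevation) axis, keep the yaw axis as a sequence.
        self.encoder = nn.Conv2d(1, feat, kernel_size=(height, 1))

    def forward(self, range_image):                 # (B, 1, 64, 900) range image
        cols = self.encoder(range_image)            # (B, feat, 1, 900) per-column features
        return torch.amax(cols, dim=3).flatten(1)   # max over yaw -> shift invariant

model = ColumnDescriptor().eval()
scan = torch.rand(1, 1, 64, 900)                    # simulated range image
rotated = torch.roll(scan, shifts=300, dims=3)      # yaw rotation = cyclic column shift
with torch.no_grad():
    d1, d2 = model(scan), model(rotated)
print(torch.allclose(d1, d2))                       # True: descriptor unchanged by yaw
```

Any permutation-invariant pooling over the yaw dimension gives the same guarantee, which is why retrieval remains stable when the vehicle revisits a place from a different heading.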


SuperFusion: Multilevel LiDAR-Camera Fusion for Long-Range HD Map Generation

arXiv.org Artificial Intelligence

High-definition (HD) semantic map generation of the environment is an essential component of autonomous driving. Existing methods have achieved good performance on this task by fusing different sensor modalities, such as LiDAR and camera. However, current works are based on raw data or network feature-level fusion and only consider short-range HD map generation, limiting their deployment in realistic autonomous driving applications. In this paper, we focus on building HD maps at short range, i.e., within 30 m, and also on predicting long-range HD maps up to 90 m, which is required by downstream path planning and control tasks to improve the smoothness and safety of autonomous driving. To this end, we propose a novel network named SuperFusion, exploiting the fusion of LiDAR and camera data at multiple levels. We use LiDAR depth to improve image depth estimation and use image features to guide long-range LiDAR feature prediction. We benchmark SuperFusion on the nuScenes dataset and a self-recorded dataset and show that it outperforms the state-of-the-art baseline methods by large margins on all intervals. Additionally, we apply the generated HD map to a downstream path planning task, demonstrating that the long-range HD maps predicted by our method can lead to better path planning for autonomous vehicles. Our code and self-recorded dataset will be available at https://github.com/haomo-ai/SuperFusion.
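
One way image features can guide long-range LiDAR features is cross-attention, where bird's-eye-view (BEV) LiDAR cells query the dense image feature map to fill in regions the sparse LiDAR barely covers. The dimensions and the single attention layer below are assumptions for illustration, not SuperFusion's actual multilevel fusion network.

```python
# Minimal sketch of feature-level LiDAR-camera fusion via cross-attention.
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, lidar_bev, image_feat):
        # lidar_bev: (B, N_bev, C) flattened BEV cells; image_feat: (B, N_img, C)
        fused, _ = self.attn(query=lidar_bev, key=image_feat, value=image_feat)
        return self.norm(lidar_bev + fused)        # residual fusion of both modalities

fusion = CrossModalFusion()
lidar_bev = torch.rand(1, 100 * 50, 64)            # e.g. a 100 x 50 BEV grid
image_feat = torch.rand(1, 15 * 40, 64)            # e.g. a 15 x 40 image feature map
out = fusion(lidar_bev, image_feat)
print(out.shape)                                    # torch.Size([1, 5000, 64])
```

The residual connection keeps the original LiDAR evidence intact at short range while letting distant, sparsely observed BEV cells borrow appearance cues from the camera.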