AITopics | Wang, Chengyue

Plotting

Wang, Chengyue

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Human Observation-Inspired Trajectory Prediction for Autonomous Driving in Mixed-Autonomy Traffic Environments

Liao, Haicheng, Liu, Shangqian, Li, Yongkang, Li, Zhenning, Wang, Chengyue, Wang, Bonan, Guan, Yanchen, Xu, Chengzhong

arXiv.org Artificial IntelligenceFeb-6-2024

In the burgeoning field of autonomous vehicles (AVs), trajectory prediction remains a formidable challenge, especially in mixed autonomy environments. Traditional approaches often rely on computational methods such as time-series analysis. Our research diverges significantly by adopting an interdisciplinary approach that integrates principles of human cognition and observational behavior into trajectory prediction models for AVs. We introduce a novel "adaptive visual sector" mechanism that mimics the dynamic allocation of attention human drivers exhibit based on factors like spatial orientation, proximity, and driving speed. Additionally, we develop a "dynamic traffic graph" using Convolutional Neural Networks (CNN) and Graph Attention Networks (GAT) to capture spatio-temporal dependencies among agents. Benchmark tests on the NGSIM, HighD, and MoCAD datasets reveal that our model (GAVA) outperforms state-of-the-art baselines by at least 15.2%, 19.4%, and 12.0%, respectively. Our findings underscore the potential of leveraging human cognition principles to enhance the proficiency and adaptability of trajectory prediction algorithms in AVs. The code for the proposed model is available at our Github.

artificial intelligence, machine learning, vehicle, (14 more...)

arXiv.org Artificial Intelligence

2402.04318

Country:

Asia > Macao (0.15)
Asia > China (0.15)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Ground > Road (0.65)
Automobiles & Trucks (0.51)
Information Technology > Robotics & Automation (0.41)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

GPT-4 Enhanced Multimodal Grounding for Autonomous Driving: Leveraging Cross-Modal Attention with Large Language Models

Liao, Haicheng, Shen, Huanming, Li, Zhenning, Wang, Chengyue, Li, Guofa, Bie, Yiming, Xu, Chengzhong

arXiv.org Artificial IntelligenceDec-6-2023

In the field of autonomous vehicles (AVs), accurately discerning commander intent and executing linguistic commands within a visual context presents a significant challenge. This paper introduces a sophisticated encoder-decoder framework, developed to address visual grounding in AVs.Our Context-Aware Visual Grounding (CAVG) model is an advanced system that integrates five core encoders-Text, Image, Context, and Cross-Modal-with a Multimodal decoder. This integration enables the CAVG model to adeptly capture contextual semantics and to learn human emotional features, augmented by state-of-the-art Large Language Models (LLMs) including GPT-4. The architecture of CAVG is reinforced by the implementation of multi-head cross-modal attention mechanisms and a Region-Specific Dynamic (RSD) layer for attention modulation. This architectural design enables the model to efficiently process and interpret a range of cross-modal inputs, yielding a comprehensive understanding of the correlation between verbal commands and corresponding visual scenes. Empirical evaluations on the Talk2Car dataset, a real-world benchmark, demonstrate that CAVG establishes new standards in prediction accuracy and operational efficiency. Notably, the model exhibits exceptional performance even with limited training data, ranging from 50% to 75% of the full dataset. This feature highlights its effectiveness and potential for deployment in practical AV applications. Moreover, CAVG has shown remarkable robustness and adaptability in challenging scenarios, including long-text command interpretation, low-light conditions, ambiguous command contexts, inclement weather conditions, and densely populated urban environments. The code for the proposed model is available at our Github.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2312.03543

Country:

Asia > China (0.46)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Promising Solution (0.67)

Industry:

Transportation > Ground > Road (0.51)
Information Technology > Robotics & Automation (0.51)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback