 Xu, Cheng-Zhong


Learning Orientation Field for OSM-Guided Autonomous Navigation

arXiv.org Artificial Intelligence

OpenStreetMap (OSM) has gained popularity recently in autonomous navigation due to its public accessibility, lower maintenance costs, and broader geographical coverage. However, existing methods often struggle with noisy OSM data and incomplete sensor observations, leading to inaccuracies in trajectory planning. These challenges are particularly evident in complex driving scenarios, such as at intersections or under occlusion. To address these challenges, we propose a robust and explainable two-stage framework that learns an Orientation Field (OrField) for robot navigation by integrating LiDAR scans and OSM routes. In the first stage, we introduce a novel representation, OrField, which provides an orientation for each grid cell on the map by reasoning jointly over noisy LiDAR scans and OSM routes. To generate a robust OrField, we train a deep neural network that encodes a versatile initial OrField and outputs an optimized OrField. In the second stage, building on OrField, we propose two trajectory planners for OSM-guided robot navigation, Field-RRT* and Field-Bezier, which adapt the Rapidly-exploring Random Tree (RRT*) algorithm and Bezier curves, respectively, to estimate trajectories. Thanks to the robustness of OrField, which captures both global and local information, Field-RRT* and Field-Bezier can generate accurate and reliable trajectories even in challenging conditions. We validate our approach through experiments on the SemanticKITTI dataset and our own campus dataset. The results demonstrate the effectiveness of our method, which achieves superior performance in complex and noisy conditions. Our code for network training and real-world deployment is available at https://github.com/IMRL/OriField.
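The abstract describes OrField only at a high level; as a hedged illustration, the sketch below assumes OrField is simply a 2D grid of per-cell orientations (in radians) and greedily rolls out a path by stepping along the local orientation, the kind of raw guidance a planner such as Field-RRT* or Field-Bezier could then refine. All names and the grid format are hypothetical, not the authors' implementation.

```python
import numpy as np

def follow_orientation_field(orfield, start, steps=50, step_size=1.0):
    """Greedily roll out a path on a 2D orientation field (hypothetical format).

    orfield: (H, W) array of per-cell orientations in radians.
    start:   (row, col) starting position.
    """
    h, w = orfield.shape
    path = [np.array(start, dtype=float)]
    for _ in range(steps):
        r, c = path[-1]
        ri, ci = int(round(r)), int(round(c))
        if not (0 <= ri < h and 0 <= ci < w):
            break  # stepped off the map
        theta = orfield[ri, ci]
        # Step along the local orientation (row ~ y, col ~ x).
        step = step_size * np.array([np.sin(theta), np.cos(theta)])
        path.append(path[-1] + step)
    return np.stack(path)

# Toy field whose orientations all point along +x (theta = 0).
field = np.zeros((20, 20))
print(follow_orientation_field(field, start=(10, 0), steps=5))
```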


OLiDM: Object-aware LiDAR Diffusion Models for Autonomous Driving

arXiv.org Artificial Intelligence

To enhance autonomous driving safety in complex scenarios, various methods have been proposed to simulate LiDAR point cloud data. Nevertheless, these methods often face challenges in producing high-quality, diverse, and controllable foreground objects. To address the needs of object-aware tasks in 3D perception, we introduce OLiDM, a novel framework capable of generating high-fidelity LiDAR data at both the object and the scene levels. OLiDM consists of two pivotal components: the Object-Scene Progressive Generation (OPG) module and the Object Semantic Alignment (OSA) module. OPG adapts to user-specific prompts to generate desired foreground objects, which are subsequently employed as conditions in scene generation, ensuring controllable outputs at both the object and scene levels. This also facilitates the association of user-defined object-level annotations with the generated LiDAR scenes. Moreover, OSA aims to rectify the misalignment between foreground objects and background scenes, enhancing the overall quality of the generated objects. The broad effectiveness of OLiDM is demonstrated across various LiDAR generation tasks, as well as in 3D perception tasks. Specifically, on the KITTI-360 dataset, OLiDM surpasses prior state-of-the-art methods such as UltraLiDAR by 17.5 in FPD. Additionally, in sparse-to-dense LiDAR completion, OLiDM achieves a significant improvement over LiDARGen, with a 57.47% increase in semantic IoU. Moreover, OLiDM enhances the performance of mainstream 3D detectors by 2.4% in mAP and 1.9% in NDS, underscoring its potential in advancing object-aware 3D tasks. Code is available at: https://yanty123.github.io/OLiDM.
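The OPG module's object-then-scene flow can be illustrated with a heavily simplified, hypothetical sketch: stand-in generators replace the actual diffusion models, but the control flow, prompt-conditioned foreground objects that then condition scene generation, mirrors the progressive scheme described above.

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_objects(prompts):
    # Stand-in for the object-level generator: one point cluster per prompt.
    return [rng.normal(loc=10.0 * i, scale=0.5, size=(64, 3))
            for i, _ in enumerate(prompts)]

def generate_scene(foreground_objects):
    # Stand-in for the scene-level generator, conditioned on the foregrounds.
    background = rng.uniform(-50.0, 50.0, size=(4096, 3))
    return np.concatenate([background] + foreground_objects, axis=0)

prompts = ["car, 10 m ahead", "pedestrian, left sidewalk"]
objects = generate_objects(prompts)   # stage 1: controllable foreground objects
scene = generate_scene(objects)       # stage 2: scene conditioned on them
print(scene.shape)                    # foreground points carry user-defined labels
```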


Diverse and Fine-Grained Instruction-Following Ability Exploration with Synthetic Data

arXiv.org Artificial Intelligence

Instruction-following is particularly crucial for large language models (LLMs) to support diverse user requests. While existing work has made progress in aligning LLMs with human preferences, evaluating their instruction-following capabilities remains a challenge due to the complexity and diversity of real-world user instructions. Existing evaluation methods focus on general skills but suffer from two main shortcomings: a lack of fine-grained, task-level evaluation and a reliance on a single way of expressing each instruction. To address these problems, this paper introduces DINGO, a fine-grained and diverse instruction-following evaluation dataset with two main advantages: (1) DINGO is based on a manually annotated, fine-grained, multi-level category tree with 130 nodes derived from real-world user requests; (2) DINGO includes diverse instructions generated by both GPT-4 and human experts. Through extensive experiments, we demonstrate that DINGO not only provides a more challenging and comprehensive evaluation for LLMs, but also provides fine-grained, task-level directions for further improving them.
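As a rough sketch of what task-level, multi-level evaluation over a category tree could look like, the snippet below rolls hypothetical leaf-level scores up a toy tree; the node names and scores are invented for illustration and are not DINGO's actual taxonomy.

```python
# Toy category tree: parent -> children. Leaves are fine-grained tasks.
tree = {
    "instruction_following": ["content_constraints", "format_constraints"],
    "content_constraints": ["keyword_inclusion", "topic_adherence"],
    "format_constraints": ["json_output", "length_limit"],
}
leaf_scores = {"keyword_inclusion": 0.82, "topic_adherence": 0.74,
               "json_output": 0.91, "length_limit": 0.66}

def node_score(node):
    if node in leaf_scores:                       # fine-grained task level
        return leaf_scores[node]
    children = tree[node]
    return sum(node_score(c) for c in children) / len(children)

for node in tree:                                 # report every level of the tree
    print(node, round(node_score(node), 3))
```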


Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models through Intervention without Tuning

arXiv.org Artificial Intelligence

Despite the great success of large language models (LLMs) in various tasks, they are prone to generating hallucinations. We introduce Truth Forest, a method that enhances truthfulness in LLMs by uncovering hidden truth representations with multi-dimensional orthogonal probes. Specifically, it creates multiple orthogonal bases for modeling truth by incorporating orthogonality constraints into the probes. Moreover, we introduce Random Peek, a systematic technique that considers an extended range of positions within the sequence, reducing the gap between discerning and generating truth features in LLMs. With this approach, we improved the truthfulness of Llama-2-7B from 40.8% to 74.5% on TruthfulQA. Likewise, significant improvements are observed in fine-tuned models. We conducted a thorough analysis of truth features using the probes. Our visualization results show that the orthogonal probes capture complementary truth-related features, forming well-defined clusters that reveal the inherent structure of the dataset.
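A minimal, hypothetical sketch of the core idea, training several linear probes on hidden states with a soft orthogonality penalty so that each probe captures a distinct truth-related direction, is given below; the data is random and the loss weights are arbitrary, so this is not the authors' training recipe.

```python
import torch
import torch.nn.functional as F

hidden_dim, num_probes = 128, 4
probes = torch.nn.Parameter(0.01 * torch.randn(num_probes, hidden_dim))
opt = torch.optim.Adam([probes], lr=1e-2)

h = torch.randn(256, hidden_dim)              # stand-in hidden states
y = torch.randint(0, 2, (256,)).float()       # stand-in truthfulness labels

for _ in range(200):
    logits = h @ probes.T                     # (batch, num_probes)
    cls_loss = F.binary_cross_entropy_with_logits(
        logits, y.unsqueeze(1).expand_as(logits))
    gram = probes @ probes.T                  # penalize off-diagonal overlap
    ortho_loss = (gram - torch.diag(torch.diagonal(gram))).pow(2).sum()
    loss = cls_loss + 0.1 * ortho_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
print(float(loss))
```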


DME-Driver: Integrating Human Decision Logic and 3D Scene Perception in Autonomous Driving

arXiv.org Artificial Intelligence

In the field of autonomous driving, two important properties of an autonomous driving system are the explainability of its decision logic and the accuracy of its environmental perception. This paper introduces DME-Driver, a new autonomous driving system that enhances both the performance and the reliability of autonomous driving. DME-Driver uses a powerful vision-language model as the decision-maker and a planning-oriented perception model as the control-signal generator. To ensure explainable and reliable driving decisions, the logical decision-maker is built on a large vision-language model that follows the logic employed by experienced human drivers and makes decisions in a similar manner. The generation of accurate control signals, on the other hand, relies on precise and detailed environmental perception, which is where 3D scene perception models excel. A planning-oriented perception model is therefore employed as the signal generator: it translates the logical decisions made by the decision-maker into accurate control signals for the self-driving car. To train the proposed model effectively, we created a new autonomous driving dataset that encompasses a diverse range of human driver behaviors and their underlying motivations. By leveraging this dataset, our model achieves high-precision planning accuracy through a logical thinking process.
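The abstract describes the architecture only at the interface level; the hypothetical sketch below shows that interface, a language-level decision produced first and then translated into low-level control signals, with stand-in functions replacing the actual vision-language and perception models.

```python
from dataclasses import dataclass

@dataclass
class Decision:
    maneuver: str       # e.g. "yield", "lane_change_left"
    rationale: str      # human-readable explanation of the decision

@dataclass
class ControlSignal:
    steer: float        # rad
    throttle: float     # [0, 1]
    brake: float        # [0, 1]

def decision_maker(scene_description: str) -> Decision:
    # Stand-in for the vision-language decision-maker.
    return Decision(maneuver="yield", rationale="pedestrian at crosswalk")

def signal_generator(decision: Decision) -> ControlSignal:
    # Stand-in for the planning-oriented perception model.
    if decision.maneuver == "yield":
        return ControlSignal(steer=0.0, throttle=0.0, brake=0.6)
    return ControlSignal(steer=0.0, throttle=0.3, brake=0.0)

print(signal_generator(decision_maker("camera + LiDAR frame")))
```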


DI-V2X: Learning Domain-Invariant Representation for Vehicle-Infrastructure Collaborative 3D Object Detection

arXiv.org Artificial Intelligence

Vehicle-to-Everything (V2X) collaborative perception has recently gained significant attention due to its capability to enhance scene understanding by integrating information from various agents, e.g., vehicles and infrastructure. However, current works often treat the information from each agent equally, ignoring the inherent domain gap caused by the different LiDAR sensors used by each agent, which leads to suboptimal performance. In this paper, we propose DI-V2X, which aims to learn Domain-Invariant representations through a new distillation framework to mitigate the domain discrepancy in the context of V2X 3D object detection. DI-V2X comprises three essential components: a domain-mixing instance augmentation (DMA) module, a progressive domain-invariant distillation (PDD) module, and a domain-adaptive fusion (DAF) module. Specifically, DMA builds a domain-mixing 3D instance bank for the teacher and student models during training, resulting in aligned data representations. Next, PDD encourages the student models from different domains to gradually learn a domain-invariant feature representation aligned with the teacher, where the overlapping regions between agents are used as guidance to facilitate the distillation process. Furthermore, DAF closes the domain gap between the students by incorporating calibration-aware domain-adaptive attention. Extensive experiments on the challenging DAIR-V2X and V2XSet benchmark datasets demonstrate that DI-V2X achieves remarkable performance, outperforming all previous V2X models. Code is available at https://github.com/Serenos/DI-V2X.
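A hedged sketch of the distillation idea behind PDD, pulling student BEV features toward teacher features only inside the region both agents observe, is shown below with random tensors; the masking scheme and normalization are illustrative assumptions, not the released implementation.

```python
import torch

B, C, H, W = 2, 64, 100, 100
teacher_feat = torch.randn(B, C, H, W)
student_feat = torch.randn(B, C, H, W, requires_grad=True)

overlap_mask = torch.zeros(B, 1, H, W)        # stand-in overlap between agents
overlap_mask[:, :, 30:70, 30:70] = 1.0

# Mean squared distillation error, restricted to the overlapping region.
diff = (student_feat - teacher_feat.detach()) ** 2
distill_loss = (diff * overlap_mask).sum() / (overlap_mask.sum() * C)
distill_loss.backward()
print(float(distill_loss))
```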


Improving Pre-trained Language Model Fine-tuning with Noise Stability Regularization

arXiv.org Artificial Intelligence

The advent of large-scale pre-trained language models has contributed greatly to the recent progress in natural language processing. Many state-of-the-art language models are first trained on a large text corpus and then fine-tuned on downstream tasks. Despite its recent success and wide adoption, fine-tuning a pre-trained language model often suffers from overfitting, which leads to poor generalizability due to the extremely high complexity of the model and the limited training samples from downstream tasks. To address this problem, we propose a novel and effective fine-tuning framework, named Layerwise Noise Stability Regularization (LNSR). Specifically, we propose to inject standard Gaussian noise or in-manifold noise and regularize the hidden representations of the fine-tuned model. We first provide theoretical analyses to support the efficacy of our method, and then demonstrate its advantages over other state-of-the-art algorithms, including L2-SP, Mixout, and SMART. While these previous works only verify the effectiveness of their methods on relatively simple text classification tasks, we also verify our method on question answering tasks, where the target problem is much more difficult and more training examples are available. Furthermore, extensive experimental results indicate that the proposed algorithm not only enhances the in-domain performance of the language models but also improves their domain generalization performance on out-of-domain data.
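As a minimal sketch of the noise-stability idea, assuming a toy layer in place of a pre-trained transformer block, the regularizer below perturbs an intermediate representation with Gaussian noise and penalizes how much the layer's output changes; the noise scale and weighting are illustrative.

```python
import torch
import torch.nn as nn

layer = nn.Sequential(nn.Linear(128, 128), nn.Tanh())   # stand-in for a model layer
h = torch.randn(32, 128)                                 # hidden states feeding it

def noise_stability_penalty(layer, h, sigma=0.1):
    clean = layer(h)
    noisy = layer(h + sigma * torch.randn_like(h))
    return (noisy - clean).pow(2).mean()

task_loss = layer(h).pow(2).mean()                       # stand-in downstream loss
loss = task_loss + 1.0 * noise_stability_penalty(layer, h)
loss.backward()
print(float(loss))
```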


Learning the Unlearnable: Adversarial Augmentations Suppress Unlearnable Example Attacks

arXiv.org Artificial Intelligence

Unlearnable example attacks are data poisoning techniques that can be used to safeguard public data against unauthorized use for training deep learning models. These methods add stealthy perturbations to the original images, making it difficult for deep learning models to learn effectively from the training data. Current research suggests that adversarial training can, to a certain degree, mitigate the impact of unlearnable example attacks, while common data augmentation methods are not effective against such poisons. Adversarial training, however, demands considerable computational resources and can result in non-trivial accuracy loss. In this paper, we introduce UEraser, a method that outperforms current defenses against different types of state-of-the-art unlearnable example attacks through a combination of effective data augmentation policies and loss-maximizing adversarial augmentations. In stark contrast to current SOTA adversarial training methods, UEraser uses adversarial augmentations, which extend beyond the $\ell_p$ perturbation budget assumed by current unlearning attacks and defenses. This also helps to improve the model's generalization ability, thus protecting against accuracy loss. UEraser wipes out the unlearning effect with error-maximizing data augmentations, thus restoring trained model accuracies. Interestingly, UEraser-Lite, a fast variant without adversarial augmentations, is also highly effective at preserving clean accuracies. On challenging unlearnable CIFAR-10, CIFAR-100, SVHN, and ImageNet-subset datasets produced with various attacks, it achieves results comparable to those obtained during clean training. We also demonstrate its efficacy against possible adaptive attacks. Our code is open source and available to the deep learning community: https://github.com/lafeat/ueraser.
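A simplified, hypothetical sketch of the loss-maximizing selection step is shown below: among a few random augmentations of a batch, the model is updated on the view it currently finds hardest. The augmentation family, model, and data here are toy stand-ins, not UEraser's actual policies.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)

def random_augment(x):
    # Toy augmentation family: random horizontal flip + random translation.
    if torch.rand(()) < 0.5:
        x = torch.flip(x, dims=[-1])
    shift = int(torch.randint(-4, 5, ()).item())
    return torch.roll(x, shifts=shift, dims=-1)

x = torch.rand(16, 3, 32, 32)                 # stand-in (poisoned) images
y = torch.randint(0, 10, (16,))

candidates = [random_augment(x) for _ in range(4)]
losses = [F.cross_entropy(model(c), y) for c in candidates]
worst = int(torch.stack([l.detach() for l in losses]).argmax())

opt.zero_grad()
losses[worst].backward()                      # train on the loss-maximizing view
opt.step()
print(float(losses[worst]))
```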


Federated Noisy Client Learning

arXiv.org Artificial Intelligence

Federated learning (FL) collaboratively aggregates a shared global model from multiple local clients while keeping the training data decentralized in order to preserve data privacy. However, standard FL methods ignore the noisy client issue, which may harm the overall performance of the aggregated model. In this paper, we first analyze the noisy client problem, and then model noisy clients with different noise distributions (e.g., Bernoulli and truncated Gaussian distributions). To learn with noisy clients, we propose a simple yet effective FL framework, named Federated Noisy Client Learning (Fed-NCL), which is a plug-and-play algorithm containing two main components: a data quality measurement (DQM) that dynamically quantifies the data quality of each participating client, and a noise-robust aggregation (NRA) that adaptively aggregates the local models by jointly considering the amount of local training data and the data quality of each client. Fed-NCL can be easily applied in any standard FL workflow to handle the noisy client issue. Experimental results on various datasets demonstrate that our algorithm boosts the performance of different state-of-the-art systems with noisy clients.
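The aggregation rule implied by the abstract, weighting each client by both its amount of training data and an estimated data-quality score, can be sketched as follows; the scores and parameters are random placeholders, and the exact weighting in Fed-NCL may differ.

```python
import numpy as np

client_models = [np.random.randn(10) for _ in range(4)]    # stand-in parameters
num_samples = np.array([500.0, 1200.0, 300.0, 800.0])      # local data amounts
quality = np.array([0.9, 0.4, 0.95, 0.7])                  # stand-in DQM scores

weights = num_samples * quality                            # joint weighting
weights = weights / weights.sum()

global_model = sum(w * m for w, m in zip(weights, client_models))
print(weights.round(3), global_model[:3])
```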


Network-wide traffic signal control optimization using a multi-agent deep reinforcement learning

arXiv.org Artificial Intelligence

Inefficient traffic control may cause numerous problems, such as traffic congestion and energy waste. This paper proposes a novel multi-agent reinforcement learning method, named KS-DDPG (Knowledge Sharing Deep Deterministic Policy Gradient), to achieve optimal control by enhancing the cooperation between traffic signals. By introducing a knowledge-sharing communication protocol, each agent can access the collective representation of the traffic environment collected by all agents. The proposed method is evaluated through two experiments using synthetic and real-world datasets, respectively. Comparisons with state-of-the-art reinforcement learning-based and conventional transportation methods demonstrate that KS-DDPG is significantly more efficient at controlling large-scale transportation networks and coping with fluctuations in traffic flow. In addition, the introduced communication mechanism is shown to speed up the convergence of the model without significantly increasing the computational burden.
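A rough, hypothetical sketch of the knowledge-sharing mechanism is given below: each agent writes an encoded summary of its local observation into a shared memory, and every agent's policy input is its own observation concatenated with the pooled collective representation. The encoder and pooling choices are illustrative, not the paper's exact protocol.

```python
import numpy as np

num_agents, obs_dim, mem_dim = 4, 8, 16
rng = np.random.default_rng(0)
W_write = 0.1 * rng.normal(size=(obs_dim, mem_dim))        # stand-in write encoder

observations = rng.normal(size=(num_agents, obs_dim))      # one per intersection
shared_memory = np.tanh(observations @ W_write)            # each agent's message
collective = shared_memory.mean(axis=0)                    # pooled knowledge

policy_inputs = [np.concatenate([obs, collective]) for obs in observations]
print(policy_inputs[0].shape)    # each actor sees local obs + shared knowledge
```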