AITopics | Zhang, Qingwen

Collaborating Authors

Zhang, Qingwen

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Sparse Additive Contextual Bandits: A Nonparametric Approach for Online Decision-making with High-dimensional Covariates

Wang, Wenjia, Zhang, Qingwen, Zhang, Xiaowei

arXiv.org Machine LearningMar-21-2025

Personalized services are central to today's digital landscape, where online decision-making is commonly formulated as contextual bandit problems. Two key challenges emerge in modern applications: high-dimensional covariates and the need for nonparametric models to capture complex reward-covariate relationships. We address these challenges by developing a contextual bandit algorithm based on sparse additive reward models in reproducing kernel Hilbert spaces. We establish statistical properties of the doubly penalized method applied to random regions, introducing novel analyses under bandit feedback. Our algorithm achieves sublinear cumulative regret over the time horizon $T$ while scaling logarithmically with covariate dimensionality $d$. Notably, we provide the first regret upper bound with logarithmic growth in $d$ for nonparametric contextual bandits with high-dimensional covariates. We also establish a lower bound, with the gap to the upper bound vanishing as smoothness increases. Extensive numerical experiments demonstrate our algorithm's superior performance in high-dimensional settings compared to existing approaches.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Machine Learning

2503.16941

Country: Asia > China (0.14)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.87)

Add feedback

HiMo: High-Speed Objects Motion Compensation in Point Clouds

Zhang, Qingwen, Khoche, Ajinkya, Yang, Yi, Ling, Li, Mansouri, Sina Sharif, Andersson, Olov, Jensfelt, Patric

arXiv.org Artificial IntelligenceMar-2-2025

LiDAR point clouds often contain motion-induced distortions, degrading the accuracy of object appearances in the captured data. In this paper, we first characterize the underlying reasons for the point cloud distortion and show that this is present in public datasets. We find that this distortion is more pronounced in high-speed environments such as highways, as well as in multi-LiDAR configurations, a common setup for heavy vehicles. Previous work has dealt with point cloud distortion from the ego-motion but fails to consider distortion from the motion of other objects. We therefore introduce a novel undistortion pipeline, HiMo, that leverages scene flow estimation for object motion compensation, correcting the depiction of dynamic objects. We further propose an extension of a state-of-the-art self-supervised scene flow method. Due to the lack of well-established motion distortion metrics in the literature, we also propose two metrics for compensation performance evaluation: compensation accuracy at a point level and shape similarity on objects. To demonstrate the efficacy of our method, we conduct extensive experiments on the Argoverse 2 dataset and a new real-world dataset. Our new dataset is collected from heavy vehicles equipped with multi-LiDARs and on highways as opposed to mostly urban settings in the existing datasets. The source code, including all methods and the evaluation data, will be provided upon publication. See https://kin-zhang.github.io/HiMo for more details.

artificial intelligence, distortion, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2503.00803

Country:

Europe > Sweden (0.14)
Asia > China (0.14)

Genre: Research Report (1.00)

Industry:

Automobiles & Trucks (0.72)
Transportation (0.48)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

MambaFlow: A Novel and Flow-guided State Space Model for Scene Flow Estimation

Luo, Jiehao, Cheng, Jintao, Tang, Xiaoyu, Zhang, Qingwen, Xue, Bohuan, Fan, Rui

arXiv.org Artificial IntelligenceFeb-24-2025

Scene flow estimation aims to predict 3D motion from consecutive point cloud frames, which is of great interest in autonomous driving field. Existing methods face challenges such as insufficient spatio-temporal modeling and inherent loss of fine-grained feature during voxelization. However, the success of Mamba, a representative state space model (SSM) that enables global modeling with linear complexity, provides a promising solution. In this paper, we propose MambaFlow, a novel scene flow estimation network with a mamba-based decoder. It enables deep interaction and coupling of spatio-temporal features using a well-designed backbone. Innovatively, we steer the global attention modeling of voxel-based features with point offset information using an efficient Mamba-based decoder, learning voxel-to-point patterns that are used to devoxelize shared voxel representations into point-wise features. To further enhance the model's generalization capabilities across diverse scenarios, we propose a novel scene-adaptive loss function that automatically adapts to different motion patterns.Extensive experiments on the Argoverse 2 benchmark demonstrate that MambaFlow achieves state-of-the-art performance with real-time inference speed among existing works, enabling accurate flow estimation in real-world urban scenarios. The code is available at https://github.com/SCNU-RISLAB/MambaFlow.

artificial intelligence, deep learning, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2502.16907

Country:

Asia > China (0.96)
North America > United States > California (0.14)

Genre: Research Report (1.00)

Industry: Information Technology (0.49)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.35)

Add feedback

SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving

Zhang, Qingwen, Yang, Yi, Li, Peizheng, Andersson, Olov, Jensfelt, Patric

arXiv.org Artificial IntelligenceJul-1-2024

Scene flow estimation predicts the 3D motion at each point in successive LiDAR scans. This detailed, point-level, information can help autonomous vehicles to accurately predict and understand dynamic changes in their surroundings. Current state-of-the-art methods require annotated data to train scene flow networks and the expense of labeling inherently limits their scalability. Self-supervised approaches can overcome the above limitations, yet face two principal challenges that hinder optimal performance: point distribution imbalance and disregard for object-level motion constraints. In this paper, we propose SeFlow, a self-supervised method that integrates efficient dynamic classification into a learning-based scene flow pipeline. We demonstrate that classifying static and dynamic points helps design targeted objective functions for different motion patterns. We also emphasize the importance of internal cluster consistency and correct object point association to refine the scene flow estimation, in particular on object details. Our real-time capable method achieves state-of-the-art performance on the self-supervised scene flow task on Argoverse 2 and Waymo datasets. The code is open-sourced at https://github.com/KTH-RPL/SeFlow along with trained model weights.

artificial intelligence, flow estimation, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2407.01702

Country: Europe > Sweden (0.28)

Genre: Research Report > Promising Solution (0.34)

Industry:

Automobiles & Trucks (0.68)
Transportation > Ground > Road (0.44)
Information Technology > Robotics & Automation (0.44)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Hard Cases Detection in Motion Prediction by Vision-Language Foundation Models

Yang, Yi, Zhang, Qingwen, Ikemura, Kei, Batool, Nazre, Folkesson, John

arXiv.org Artificial IntelligenceMay-31-2024

Addressing hard cases in autonomous driving, such as anomalous road users, extreme weather conditions, and complex traffic interactions, presents significant challenges. To ensure safety, it is crucial to detect and manage these scenarios effectively for autonomous driving systems. However, the rarity and high-risk nature of these cases demand extensive, diverse datasets for training robust models. Vision-Language Foundation Models (VLMs) have shown remarkable zero-shot capabilities as being trained on extensive datasets. This work explores the potential of VLMs in detecting hard cases in autonomous driving. We demonstrate the capability of VLMs such as GPT-4v in detecting hard cases in traffic participant motion prediction on both agent and scenario levels. We introduce a feasible pipeline where VLMs, fed with sequential image frames with designed prompts, effectively identify challenging agents or scenarios, which are verified by existing prediction models. Moreover, by taking advantage of this detection of hard cases by VLMs, we further improve the training efficiency of the existing motion prediction pipeline by performing data selection for the training samples suggested by GPT. We show the effectiveness and feasibility of our pipeline incorporating VLMs with state-of-the-art methods on NuScenes datasets. The code is accessible at https://github.com/KTH-RPL/Detect_VLM.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2405.20991

Country: Europe > Sweden (0.46)

Genre: Research Report (1.00)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

BeautyMap: Binary-Encoded Adaptable Ground Matrix for Dynamic Points Removal in Global Maps

Jia, Mingkai, Zhang, Qingwen, Yang, Bowen, Wu, Jin, Liu, Ming, Jensfelt, Patric

arXiv.org Artificial IntelligenceMay-12-2024

Global point clouds that correctly represent the static environment features can facilitate accurate localization and robust path planning. However, dynamic objects introduce undesired ghost tracks that are mixed up with the static environment. Existing dynamic removal methods normally fail to balance the performance in computational efficiency and accuracy. In response, we present BeautyMap to efficiently remove the dynamic points while retaining static features for high-fidelity global maps. Our approach utilizes a binary-encoded matrix to efficiently extract the environment features. With a bit-wise comparison between matrices of each frame and the corresponding map region, we can extract potential dynamic regions. Then we use coarse to fine hierarchical segmentation of the $z$-axis to handle terrain variations. The final static restoration module accounts for the range-visibility of each single scan and protects static points out of sight. Comparative experiments underscore BeautyMap's superior performance in both accuracy and efficiency against other dynamic points removal methods. The code is open-sourced at https://github.com/MKJia/BeautyMap.

artificial intelligence, beautymap, dynamic point, (13 more...)

arXiv.org Artificial Intelligence

2405.07283

Country: Asia > China > Guangdong Province (0.14)

Genre: Research Report (0.64)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)

Add feedback

DUFOMap: Efficient Dynamic Awareness Mapping

Duberg, Daniel, Zhang, Qingwen, Jia, MingKai, Jensfelt, Patric

arXiv.org Artificial IntelligenceApr-12-2024

The dynamic nature of the real world is one of the main challenges in robotics. The first step in dealing with it is to detect which parts of the world are dynamic. A typical benchmark task is to create a map that contains only the static part of the world to support, for example, localization and planning. Current solutions are often applied in post-processing, where parameter tuning allows the user to adjust the setting for a specific dataset. In this paper, we propose DUFOMap, a novel dynamic awareness mapping framework designed for efficient online processing. Despite having the same parameter settings for all scenarios, it performs better or is on par with state-of-the-art methods. Ray casting is utilized to identify and classify fully observed empty regions. Since these regions have been observed empty, it follows that anything inside them at another time must be dynamic. Evaluation is carried out in various scenarios, including outdoor environments in KITTI and Argoverse 2, open areas on the KTH campus, and with different sensor types. DUFOMap outperforms the state of the art in terms of accuracy and computational efficiency. The source code, benchmarks, and links to the datasets utilized are provided. See https://kth-rpl.github.io/dufomap for more details.

artificial intelligence, dufomap, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2403.01449

Country:

Asia > China (0.14)
Europe > Sweden (0.14)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

DeFlow: Decoder of Scene Flow Network in Autonomous Driving

Zhang, Qingwen, Yang, Yi, Fang, Heng, Geng, Ruoyu, Jensfelt, Patric

arXiv.org Artificial IntelligenceJan-29-2024

Scene flow estimation determines a scene's 3D motion field, by predicting the motion of points in the scene, especially for aiding tasks in autonomous driving. Many networks with large-scale point clouds as input use voxelization to create a pseudo-image for real-time running. However, the voxelization process often results in the loss of point-specific features. This gives rise to a challenge in recovering those features for scene flow tasks. Our paper introduces DeFlow which enables a transition from voxel-based features to point features using Gated Recurrent Unit (GRU) refinement. To further enhance scene flow estimation performance, we formulate a novel loss function that accounts for the data imbalance between static and dynamic points. Evaluations on the Argoverse 2 scene flow task reveal that DeFlow achieves state-of-the-art results on large-scale point cloud data, demonstrating that our network has better performance and efficiency compared to others. The code is open-sourced at https://github.com/KTH-RPL/deflow.

artificial intelligence, estimation, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2401.16122

Country: Europe > Sweden (0.28)

Genre: Research Report (0.64)

Industry:

Automobiles & Trucks (0.72)
Transportation > Ground > Road (0.62)
Information Technology > Robotics & Automation (0.62)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

Human-Centric Autonomous Systems With LLMs for User Command Reasoning

Yang, Yi, Zhang, Qingwen, Li, Ci, Marta, Daniel Simões, Batool, Nazre, Folkesson, John

arXiv.org Artificial IntelligenceDec-19-2023

The evolution of autonomous driving has made remarkable advancements in recent years, evolving into a tangible reality. However, a human-centric large-scale adoption hinges on meeting a variety of multifaceted requirements. To ensure that the autonomous system meets the user's intent, it is essential to accurately discern and interpret user commands, especially in complex or emergency situations. To this end, we propose to leverage the reasoning capabilities of Large Language Models (LLMs) to infer system requirements from in-cabin users' commands. Through a series of experiments that include different LLM models and prompt designs, we explore the few-shot multivariate binary classification accuracy of system requirements from natural language textual commands. We confirm the general ability of LLMs to understand and reason about prompts but underline that their effectiveness is conditioned on the quality of both the LLM model and the design of appropriate sequential prompts. Code and models are public with the link \url{https://github.com/KTH-RPL/DriveCmd_LLM}.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2311.08206

Country: Europe > Sweden (0.14)

Genre: Research Report (0.64)

Industry:

Information Technology (0.69)
Transportation > Ground > Road (0.36)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Add feedback

RMP: A Random Mask Pretrain Framework for Motion Prediction

Yang, Yi, Zhang, Qingwen, Gilles, Thomas, Batool, Nazre, Folkesson, John

arXiv.org Artificial IntelligenceSep-16-2023

As the pretraining technique is growing in popularity, little work has been done on pretrained learning-based motion prediction methods in autonomous driving. In this paper, we propose a framework to formalize the pretraining task for trajectory prediction of traffic participants. Within our framework, inspired by the random masked model in natural language processing (NLP) and computer vision (CV), objects' positions at random timesteps are masked and then filled in by the learned neural network (NN). By changing the mask profile, our framework can easily switch among a range of motion-related tasks. We show that our proposed pretraining framework is able to deal with noisy inputs and improves the motion prediction accuracy and miss rate, especially for objects occluded over time by evaluating it on Argoverse and NuScenes datasets.

machine learning, natural language, prediction, (18 more...)

arXiv.org Artificial Intelligence

2309.08989

Country:

Europe > Sweden (0.28)
Asia > Middle East > Israel (0.14)

Genre: Research Report > Promising Solution (0.46)

Industry: Transportation > Ground > Road (0.35)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback