AITopics | Wang, Junming

Collaborating Authors

Wang, Junming

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Don't Shake the Wheel: Momentum-Aware Planning in End-to-End Autonomous Driving

Song, Ziying, Jia, Caiyan, Liu, Lin, Pan, Hongyu, Zhang, Yongchang, Wang, Junming, Zhang, Xingyu, Xu, Shaoqing, Yang, Lei, Luo, Yadan

arXiv.org Artificial IntelligenceMar-6-2025

End-to-end autonomous driving frameworks enable seamless integration of perception and planning but often rely on one-shot trajectory prediction, which may lead to unstable control and vulnerability to occlusions in single-frame perception. To address this, we propose the Momentum-Aware Driving (MomAD) framework, which introduces trajectory momentum and perception momentum to stabilize and refine trajectory predictions. MomAD comprises two core components: (1) Topological Trajectory Matching (TTM) employs Hausdorff Distance to select the optimal planning query that aligns with prior paths to ensure coherence;(2) Momentum Planning Interactor (MPI) cross-attends the selected planning query with historical queries to expand static and dynamic perception files. This enriched query, in turn, helps regenerate long-horizon trajectory and reduce collision risks. To mitigate noise arising from dynamic environments and detection errors, we introduce robust instance denoising during training, enabling the planning model to focus on critical signals and improve its robustness. We also propose a novel Trajectory Prediction Consistency (TPC) metric to quantitatively assess planning stability. Experiments on the nuScenes dataset demonstrate that MomAD achieves superior long-term consistency (>=3s) compared to SOTA methods. Moreover, evaluations on the curated Turning-nuScenes shows that MomAD reduces the collision rate by 26% and improves TPC by 0.97m (33.45%) over a 6s prediction horizon, while closedloop on Bench2Drive demonstrates an up to 16.3% improvement in success rate.

artificial intelligence, machine learning, trajectory, (18 more...)

arXiv.org Artificial Intelligence

2503.03125

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Industry:

Transportation > Ground > Road (0.73)
Information Technology > Robotics & Automation (0.64)
Automobiles & Trucks (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.88)

Add feedback

HE-Nav: A High-Performance and Efficient Navigation System for Aerial-Ground Robots in Cluttered Environments

Wang, Junming, Sun, Zekai, Guan, Xiuxian, Shen, Tianxiang, Huang, Dong, Zhang, Zongyuan, Duan, Tianyang, Liu, Fangming, Cui, Heming

arXiv.org Artificial IntelligenceOct-7-2024

Existing AGR navigation systems have advanced in lightly occluded scenarios (e.g., buildings) by employing 3D semantic scene completion networks for voxel occupancy prediction and constructing Euclidean Signed Distance Field (ESDF) maps for collision-free path planning. However, these systems exhibit suboptimal performance and efficiency in cluttered environments with severe occlusions (e.g., dense forests or tall walls), due to limitations arising from perception networks' low prediction accuracy and path planners' high computational overhead. In this paper, we present HE-Nav, the first high-performance and efficient navigation system tailored for AGRs operating in cluttered environments. The perception module utilizes a lightweight semantic scene completion network (LBSCNet), guided by a bird's eye view (BEV) feature fusion and enhanced by an exquisitely designed SCB-Fusion module and attention mechanism. This enables real-time and efficient obstacle prediction in cluttered areas, generating a complete local map. Building upon this completed map, our novel AG-Planner employs the energy-efficient kinodynamic A* search algorithm to guarantee planning is energy-saving. Subsequent trajectory optimization processes yield safe, smooth, dynamically feasible and ESDF-free aerial-ground hybrid paths. Extensive experiments demonstrate that HE-Nav achieved 7x energy savings in real-world situations while maintaining planning success rates of 98% in simulation scenarios. Code and video are available on our project page: https://jmwang0117.github.io/HE-Nav/.

artificial intelligence, he-nav, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2410.05079

Country: Asia > China (0.29)

Genre: Research Report (0.50)

Industry: Energy (0.69)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

HE-Drive: Human-Like End-to-End Driving with Vision Language Models

Wang, Junming, Zhang, Xingyu, Xing, Zebin, Gu, Songen, Guo, Xiaoyang, Hu, Yang, Song, Ziying, Zhang, Qian, Long, Xiaoxiao, Yin, Wei

arXiv.org Artificial IntelligenceOct-7-2024

In this paper, we propose HE-Drive: the first human-like-centric end-to-end autonomous driving system to generate trajectories that are both temporally consistent and comfortable. Recent studies have shown that imitation learning-based planners and learning-based trajectory scorers can effectively generate and select accuracy trajectories that closely mimic expert demonstrations. However, such trajectory planners and scorers face the dilemma of generating temporally inconsistent and uncomfortable trajectories. To solve the above problems, Our HE-Drive first extracts key 3D spatial representations through sparse perception, which then serves as conditional inputs for a Conditional Denoising Diffusion Probabilistic Models (DDPMs)-based motion planner to generate temporal consistency multi-modal trajectories. A Vision-Language Models (VLMs)-guided trajectory scorer subsequently selects the most comfortable trajectory from these candidates to control the vehicle, ensuring human-like end-to-end driving. Experiments show that HE-Drive not only achieves state-of-the-art performance (i.e., reduces the average collision rate by 71% than VAD) and efficiency (i.e., 1.9X faster than SparseDrive) on the challenging nuScenes and OpenScene datasets but also provides the most comfortable driving experience on real-world data.For more information, visit the project website: https://jmwang0117.github.io/HE-Drive/.

large language model, machine learning, trajectory, (19 more...)

arXiv.org Artificial Intelligence

2410.05051

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Industry:

Automobiles & Trucks (0.92)
Transportation > Ground > Road (0.52)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

OccRWKV: Rethinking Efficient 3D Semantic Occupancy Prediction with Linear Complexity

Wang, Junming, Yin, Wei, Long, Xiaoxiao, Zhang, Xingyu, Xing, Zebin, Guo, Xiaoyang, Zhang, Qian

arXiv.org Artificial IntelligenceOct-1-2024

3D semantic occupancy prediction networks have demonstrated remarkable capabilities in reconstructing the geometric and semantic structure of 3D scenes, providing crucial information for robot navigation and autonomous driving systems. However, due to their large overhead from dense network structure designs, existing networks face challenges balancing accuracy and latency. In this paper, we introduce OccRWKV, an efficient semantic occupancy network inspired by Receptance Weighted Key Value (RWKV). OccRWKV separates semantics, occupancy prediction, and feature fusion into distinct branches, each incorporating Sem-RWKV and Geo-RWKV blocks. These blocks are designed to capture long-range dependencies, enabling the network to learn domain-specific representation (i.e., semantics and geometry), which enhances prediction accuracy. Leveraging the sparse nature of real-world 3D occupancy, we reduce computational overhead by projecting features into the bird's-eye view (BEV) space and propose a BEV-RWKV block for efficient feature enhancement and fusion. This enables real-time inference at 22.2 FPS without compromising performance. Experiments demonstrate that OccRWKV outperforms the state-of-the-art methods on the SemanticKITTI dataset, achieving a mIoU of 25.1 while being 20 times faster than the best baseline, Co-Occ, making it suitable for real-time deployment on robots to enhance autonomous navigation efficiency. Code and video are available on our project page: https://jmwang0117.github.io/OccRWKV/.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2409.19987

Genre: Research Report > Promising Solution (0.48)

Industry: Transportation > Ground > Road (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Hybrid-Parallel: Achieving High Performance and Energy Efficient Distributed Inference on Robots

Sun, Zekai, Guan, Xiuxian, Wang, Junming, Song, Haoze, Qing, Yuhao, Shen, Tianxiang, Huang, Dong, Liu, Fangming, Cui, Heming

arXiv.org Artificial IntelligenceMay-29-2024

The rapid advancements in machine learning techniques have led to significant achievements in various real-world robotic tasks. These tasks heavily rely on fast and energy-efficient inference of deep neural network (DNN) models when deployed on robots. To enhance inference performance, distributed inference has emerged as a promising approach, parallelizing inference across multiple powerful GPU devices in modern data centers using techniques such as data parallelism, tensor parallelism, and pipeline parallelism. However, when deployed on real-world robots, existing parallel methods fail to provide low inference latency and meet the energy requirements due to the limited bandwidth of robotic IoT. We present Hybrid-Parallel, a high-performance distributed inference system optimized for robotic IoT. Hybrid-Parallel employs a fine-grained approach to parallelize inference at the granularity of local operators within DNN layers (i.e., operators that can be computed independently with the partial input, such as the convolution kernel in the convolution layer). By doing so, Hybrid-Parallel enables different operators of different layers to be computed and transmitted concurrently, and overlap the computation and transmission phases within the same inference task. The evaluation demonstrate that Hybrid-Parallel reduces inference time by 14.9% ~41.1% and energy consumption per inference by up to 35.3% compared to the state-of-the-art baselines.

artificial intelligence, hybrid-parallel, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2405.19257

Country: Asia > China (0.30)

Genre: Research Report > Promising Solution (0.34)

Industry: Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

AGRNav: Efficient and Energy-Saving Autonomous Navigation for Air-Ground Robots in Occlusion-Prone Environments

Wang, Junming, Sun, Zekai, Guan, Xiuxian, Shen, Tianxiang, Zhang, Zongyuan, Duan, Tianyang, Huang, Dong, Zhao, Shixiong, Cui, Heming

arXiv.org Artificial IntelligenceMar-18-2024

The exceptional mobility and long endurance of air-ground robots are raising interest in their usage to navigate complex environments (e.g., forests and large buildings). However, such environments often contain occluded and unknown regions, and without accurate prediction of unobserved obstacles, the movement of the air-ground robot often suffers a suboptimal trajectory under existing mapping-based and learning-based navigation methods. In this work, we present AGRNav, a novel framework designed to search for safe and energy-saving air-ground hybrid paths. AGRNav contains a lightweight semantic scene completion network (SCONet) with self-attention to enable accurate obstacle predictions by capturing contextual information and occlusion area features. The framework subsequently employs a query-based method for low-latency updates of prediction results to the grid map. Finally, based on the updated map, the hierarchical path planner efficiently searches for energy-saving paths for navigation. We validate AGRNav's performance through benchmarks in both simulated and real-world environments, demonstrating its superiority over classical and state-of-the-art methods. The open-source code is available at https://github.com/jmwang0117/AGRNav.

agrnav, artificial intelligence, sconet, (16 more...)

arXiv.org Artificial Intelligence

2403.11607

Country: Asia > China (0.28)

Genre: Research Report > Promising Solution (0.34)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Add feedback

A Cost-Effective Framework for Preference Elicitation and Aggregation

Zhao, Zhibing, Li, Haoming, Wang, Junming, Kephart, Jeffrey, Mattei, Nicholas, Su, Hui, Xia, Lirong

arXiv.org Artificial IntelligenceMay-14-2018

We propose a cost-effective framework for preference elicitation and aggregation under Plackett-Luce model with features. Given a budget, our framework iteratively computes the most cost-effective elicitation questions in order to help the agents make better group decisions. We illustrate the viability of the framework with an experiment on Amazon Mechanical Turk, which estimates the cost of answering different types of elicitation questions. We compare the prediction accuracy of our framework when adopting various information criteria that evaluate the expected information gain from a question. Our experiments show carefully designed information criteria are much more efficient than randomly asking questions given budget constraint.

agent, artificial intelligence, health & medicine, (15 more...)

arXiv.org Artificial Intelligence

1805.05287

Country:

North America > United States > Oregon (0.14)
Europe > Spain > Catalonia (0.14)

Genre: Research Report (0.83)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.34)

Add feedback