AITopics | Liu, Yun-Hui

Collaborating Authors

Liu, Yun-Hui

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Leveraging Semantic Graphs for Efficient and Robust LiDAR SLAM

Wang, Neng, Lu, Huimin, Zheng, Zhiqiang, Wang, Hesheng, Liu, Yun-Hui, Chen, Xieyuanli

arXiv.org Artificial IntelligenceMar-14-2025

Accurate and robust simultaneous localization and mapping (SLAM) is crucial for autonomous mobile systems, typically achieved by leveraging the geometric features of the environment. Incorporating semantics provides a richer scene representation that not only enhances localization accuracy in SLAM but also enables advanced cognitive functionalities for downstream navigation and planning tasks. Existing point-wise semantic LiDAR SLAM methods often suffer from poor efficiency and generalization, making them less robust in diverse real-world scenarios. In this paper, we propose a semantic graph-enhanced SLAM framework, named SG-SLAM, which effectively leverages the geometric, semantic, and topological characteristics inherent in environmental structures. The semantic graph serves as a fundamental component that facilitates critical functionalities of SLAM, including robust relocalization during odometry failures, accurate loop closing, and semantic graph map construction. Our method employs a dual-threaded architecture, with one thread dedicated to online odometry and relocalization, while the other handles loop closure, pose graph optimization, and map update. This design enables our method to operate in real time and generate globally consistent semantic graph maps and point cloud maps. We extensively evaluate our method across the KITTI, MulRAN, and Apollo datasets, and the results demonstrate its superiority compared to state-of-the-art methods. Our method has been released at https://github.com/nubot-nudt/SG-SLAM.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2503.11145

Country:

Asia > China (0.46)
North America > United States (0.28)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

Image-Goal Navigation Using Refined Feature Guidance and Scene Graph Enhancement

Feng, Zhicheng, Chen, Xieyuanli, Shi, Chenghao, Luo, Lun, Chen, Zhichao, Liu, Yun-Hui, Lu, Huimin

arXiv.org Artificial IntelligenceMar-13-2025

In this paper, we introduce a novel image-goal navigation approach, named RFSG. Our focus lies in leveraging the fine-grained connections between goals, observations, and the environment within limited image data, all the while keeping the navigation architecture simple and lightweight. To this end, we propose the spatial-channel attention mechanism, enabling the network to learn the importance of multi-dimensional features to fuse the goal and observation features. In addition, a selfdistillation mechanism is incorporated to further enhance the feature representation capabilities. Given that the navigation task needs surrounding environmental information for more efficient navigation, we propose an image scene graph to establish feature associations at both the image and object levels, effectively encoding the surrounding scene information. Crossscene performance validation was conducted on the Gibson and HM3D datasets, and the proposed method achieved stateof-the-art results among mainstream methods, with a speed of up to 53.5 frames per second on an RTX3080. This contributes to the realization of end-to-end image-goal navigation in realworld scenarios. The implementation and model of our method have been released at: https://github.com/nubot-nudt/RFSG.

artificial intelligence, machine learning, navigation, (17 more...)

arXiv.org Artificial Intelligence

2503.10986

Country: Asia > China (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Robots (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Overcoming Support Dilution for Robust Few-shot Semantic Segmentation

Tang, Wailing, Yang, Biqi, Heng, Pheng-Ann, Liu, Yun-Hui, Fu, Chi-Wing

arXiv.org Artificial IntelligenceJan-23-2025

Few-shot Semantic Segmentation (FSS) is a challenging task that utilizes limited support images to segment associated unseen objects in query images. However, recent FSS methods are observed to perform worse, when enlarging the number of shots. As the support set enlarges, existing FSS networks struggle to concentrate on the high-contributed supports and could easily be overwhelmed by the low-contributed supports that could severely impair the mask predictions. In this work, we study this challenging issue, called support dilution, our goal is to recognize, select, preserve, and enhance those high-contributed supports in the raw support pool. Technically, our method contains three novel parts. First, we propose a contribution index, to quantitatively estimate if a high-contributed support dilutes. Second, we develop the Symmetric Correlation (SC) module to preserve and enhance the high-contributed support features, minimizing the distraction by the low-contributed features. Third, we design the Support Image Pruning operation, to retrieve a compact and high quality subset by discarding low-contributed supports. We conduct extensive experiments on two FSS benchmarks, COCO-20i and PASCAL-5i, the segmentation results demonstrate the compelling performance of our solution over state-of-the-art FSS approaches. Besides, we apply our solution for online segmentation and real-world segmentation, convincing segmentation results showing the practical ability of our work for real-world demonstrations.

artificial intelligence, machine learning, segmentation, (17 more...)

arXiv.org Artificial Intelligence

2501.13529

Genre: Research Report > New Finding (0.54)

Industry: Health & Medicine > Therapeutic Area (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Learning to Hop for a Single-Legged Robot with Parallel Mechanism

Zhang, Hongbo, Chu, Xiangyu, Chen, Yanlin, Tang, Yunxi, Yue, Linzhu, Liu, Yun-Hui, Au, Kwok Wai Samuel

arXiv.org Artificial IntelligenceJan-21-2025

This work presents the application of reinforcement learning to improve the performance of a highly dynamic hopping system with a parallel mechanism. Unlike serial mechanisms, parallel mechanisms can not be accurately simulated due to the complexity of their kinematic constraints and closed-loop structures. Besides, learning to hop suffers from prolonged aerial phase and the sparse nature of the rewards. To address them, we propose a learning framework to encode long-history feedback to account for the under-actuation brought by the prolonged aerial phase. In the proposed framework, we also introduce a simplified serial configuration for the parallel design to avoid directly simulating parallel structure during the training. A torque-level conversion is designed to deal with the parallel-serial conversion to handle the sim-to-real issue. Simulation and hardware experiments have been conducted to validate this framework.

artificial intelligence, machine learning, robot, (16 more...)

arXiv.org Artificial Intelligence

2501.11945

Genre: Research Report (0.50)

Industry: Energy (0.48)

Technology:

Information Technology > Artificial Intelligence > Robots > Locomotion (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

GeoManip: Geometric Constraints as General Interfaces for Robot Manipulation

Tang, Weiliang, Pan, Jia-Hui, Liu, Yun-Hui, Tomizuka, Masayoshi, Li, Li Erran, Fu, Chi-Wing, Ding, Mingyu

arXiv.org Artificial IntelligenceJan-16-2025

We present GeoManip, a framework to enable generalist robots to leverage essential conditions derived from object and part relationships, as geometric constraints, for robot manipulation. For example, cutting the carrot requires adhering to a geometric constraint: the blade of the knife should be perpendicular to the carrot's direction. By interpreting these constraints through symbolic language representations and translating them into low-level actions, GeoManip bridges the gap between natural language and robotic execution, enabling greater generalizability across diverse even unseen tasks, objects, and scenarios. Unlike vision-language-action models that require extensive training, operates training-free by utilizing large foundational models: a constraint generation module that predicts stage-specific geometric constraints and a geometry parser that identifies object parts involved in these constraints. A solver then optimizes trajectories to satisfy inferred constraints from task descriptions and the scene. Furthermore, GeoManip learns in-context and provides five appealing human-robot interaction features: on-the-fly policy adaptation, learning from human demonstrations, learning from failure cases, long-horizon action planning, and efficient data collection for imitation learning. Extensive evaluations on both simulations and real-world scenarios demonstrate GeoManip's state-of-the-art performance, with superior out-of-distribution generalization while avoiding costly model training.

constraint, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2501.09783

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Robots > Manipulation (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Adapting Segment Anything Model for Unseen Object Instance Segmentation

Cao, Rui, Song, Chuanxin, Yang, Biqi, Wang, Jiangliu, Heng, Pheng-Ann, Liu, Yun-Hui

arXiv.org Artificial IntelligenceSep-23-2024

Unseen Object Instance Segmentation (UOIS) is crucial for autonomous robots operating in unstructured environments. Previous approaches require full supervision on large-scale tabletop datasets for effective pretraining. In this paper, we propose UOIS-SAM, a data-efficient solution for the UOIS task that leverages SAM's high accuracy and strong generalization capabilities. UOIS-SAM integrates two key components: (i) a Heatmap-based Prompt Generator (HPG) to generate class-agnostic point prompts with precise foreground prediction, and (ii) a Hierarchical Discrimination Network (HDNet) that adapts SAM's mask decoder, mitigating issues introduced by the SAM baseline, such as background confusion and over-segmentation, especially in scenarios involving occlusion and texture-rich objects. Extensive experimental results on OCID, OSD, and additional photometrically challenging datasets including PhoCAL and HouseCat6D, demonstrate that, even using only 10% of the training samples compared to previous methods, UOIS-SAM achieves state-of-the-art performance in unseen object segmentation, highlighting its effectiveness and robustness in various tabletop scenes.

artificial intelligence, machine learning, uois-sam, (15 more...)

arXiv.org Artificial Intelligence

2409.15481

Country: Asia > China (0.15)

Genre: Research Report (0.40)

Industry: Media (0.37)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Adaptive Model Predictive Control with Data-driven Error Model for Quadrupedal Locomotion

Zeng, Xuanqi, Zhang, Hongbo, Yue, Linzhu, Song, Zhitao, Zhang, Linwei, Liu, Yun-Hui

arXiv.org Artificial IntelligenceJul-14-2024

Model Predictive Control (MPC) relies heavily on the robot model for its control law. However, a gap always exists between the reduced-order control model with uncertainties and the real robot, which degrades its performance. To address this issue, we propose the controller of integrating a data-driven error model into traditional MPC for quadruped robots. Our approach leverages real-world data from sensors to compensate for defects in the control model. Specifically, we employ the Autoregressive Moving Average Vector (ARMAV) model to construct the state error model of the quadruped robot using data. The predicted state errors are then used to adjust the predicted future robot states generated by MPC. By such an approach, our proposed controller can provide more accurate inputs to the system, enabling it to achieve desired states even in the presence of model parameter inaccuracies or disturbances. The proposed controller exhibits the capability to partially eliminate the disparity between the model and the real-world robot, thereby enhancing the locomotion performance of quadruped robots. We validate our proposed method through simulations and real-world experimental trials on a large-size quadruped robot that involves carrying a 20 kg un-modeled payload (84% of body weight).

artificial intelligence, controller, robot, (16 more...)

arXiv.org Artificial Intelligence

2407.10124

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.48)

Industry: Energy > Oil & Gas > Upstream (0.62)

Technology: Information Technology > Artificial Intelligence > Robots > Locomotion (1.00)

Add feedback

A Fast Online Omnidirectional Quadrupedal Jumping Framework Via Virtual-Model Control and Minimum Jerk Trajectory Generation

Yue, Linzhu, Zhang, Lingwei, Song, Zhitao, Zhang, Hongbo, Dong, Jinhu, Zeng, Xuanqi, Liu, Yun-Hui

arXiv.org Artificial IntelligenceJun-30-2024

Exploring the limits of quadruped robot agility, particularly in the context of rapid and real-time planning and execution of omnidirectional jump trajectories, presents significant challenges due to the complex dynamics involved, especially when considering significant impulse contacts. This paper introduces a new framework to enable fast, omnidirectional jumping capabilities for quadruped robots. Utilizing minimum jerk technology, the proposed framework efficiently generates jump trajectories that exploit its analytical solutions, ensuring numerical stability and dynamic compatibility with minimal computational resources. The virtual model control is employed to formulate a Quadratic Programming (QP) optimization problem to accurately track the Center of Mass (CoM) trajectories during the jump phase. The whole-body control strategies facilitate precise and compliant landing motion. Moreover, the different jumping phase is triggered by time-schedule. The framework's efficacy is demonstrated through its implementation on an enhanced version of the open-source Mini Cheetah robot. Omnidirectional jumps-including forward, backward, and other directional-were successfully executed, showcasing the robot's capability to perform rapid and consecutive jumps with an average trajectory generation and tracking solution time of merely 50 microseconds.

artificial intelligence, optimization problem, robot, (16 more...)

arXiv.org Artificial Intelligence

2407.00658

Country: Asia > China (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots > Locomotion (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.54)

Add feedback

Ada-Tracker: Soft Tissue Tracking via Inter-Frame and Adaptive-Template Matching

Guo, Jiaxin, Wang, Jiangliu, Li, Zhaoshuo, Jia, Tongyu, Dou, Qi, Liu, Yun-Hui

arXiv.org Artificial IntelligenceMay-24-2024

Soft tissue tracking is crucial for computer-assisted interventions. Existing approaches mainly rely on extracting discriminative features from the template and videos to recover corresponding matches. However, it is difficult to adopt these techniques in surgical scenes, where tissues are changing in shape and appearance throughout the surgery. To address this problem, we exploit optical flow to naturally capture the pixel-wise tissue deformations and adaptively correct the tracked template. Specifically, we first implement an inter-frame matching mechanism to extract a coarse region of interest based on optical flow from consecutive frames. To accommodate appearance change and alleviate drift, we then propose an adaptive-template matching method, which updates the tracked template based on the reliability of the estimates. Our approach, Ada-Tracker, enjoys both short-term dynamics modeling by capturing local deformations and long-term dynamics modeling by introducing global temporal compensation. We evaluate our approach on the public SurgT benchmark, which is generated from Hamlyn, SCARED, and Kidney boundary datasets. The experimental results show that Ada-Tracker achieves superior accuracy and performs more robustly against prior works. Code is available at https://github.com/wrld/Ada-Tracker.

artificial intelligence, machine learning, optical flow, (15 more...)

arXiv.org Artificial Intelligence

2403.06479

Country: Asia > China (0.47)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Surgery (0.93)
Health & Medicine > Therapeutic Area (0.68)
Health & Medicine > Health Care Technology (0.68)
Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Scalable 3D Registration via Truncated Entry-wise Absolute Residuals

Huang, Tianyu, Peng, Liangzu, Vidal, René, Liu, Yun-Hui

arXiv.org Artificial IntelligenceApr-9-2024

Given an input set of $3$D point pairs, the goal of outlier-robust $3$D registration is to compute some rotation and translation that align as many point pairs as possible. This is an important problem in computer vision, for which many highly accurate approaches have been recently proposed. Despite their impressive performance, these approaches lack scalability, often overflowing the $16$GB of memory of a standard laptop to handle roughly $30,000$ point pairs. In this paper, we propose a $3$D registration approach that can process more than ten million ($10^7$) point pairs with over $99\%$ random outliers. Moreover, our method is efficient, entails low memory costs, and maintains high accuracy at the same time. We call our method TEAR, as it involves minimizing an outlier-robust loss that computes Truncated Entry-wise Absolute Residuals. To minimize this loss, we decompose the original $6$-dimensional problem into two subproblems of dimensions $3$ and $2$, respectively, solved in succession to global optimality via a customized branch-and-bound method. While branch-and-bound is often slow and unscalable, this does not apply to TEAR as we propose novel bounding functions that are tight and computationally efficient. Experiments on various datasets are conducted to validate the scalability and efficiency of our method.

artificial intelligence, registration, truncated entry-wise absolute residual

arXiv.org Artificial Intelligence

2404.00915

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence (1.00)

Add feedback