Hu, Ruizhen
G-DexGrasp: Generalizable Dexterous Grasping Synthesis Via Part-Aware Prior Retrieval and Prior-Assisted Generation
Jian, Juntao, Liu, Xiuping, Chen, Zixuan, Li, Manyi, Liu, Jian, Hu, Ruizhen
Recent advances in dexterous grasping synthesis have demonstrated significant progress in producing reasonable and plausible grasps for many task purposes. However, it remains challenging to generalize to unseen object categories and diverse task instructions. In this paper, we propose G-DexGrasp, a retrieval-augmented generation approach that can produce high-quality dexterous hand configurations for unseen object categories and language-based task instructions. The key is to retrieve generalizable grasping priors, including the fine-grained contact part and the affordance-related distribution of relevant grasping instances, for the subsequent synthesis pipeline. Specifically, the fine-grained contact part and affordance act as generalizable guidance for a generative model to infer reasonable grasping configurations for unseen objects, while the distribution of relevant grasping instances serves as a regularization that guarantees the plausibility of synthesized grasps during the subsequent refinement optimization. Our comparison experiments validate the effectiveness of our key designs for generalization and demonstrate remarkable performance against existing approaches.
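To make the retrieval-augmented idea concrete, here is a minimal sketch of how generalizable grasping priors might be retrieved and then used as a regularizer, assuming a hypothetical database of grasp instances annotated with a contact-part label, an affordance embedding, and a hand pose. This is an illustrative stand-in, not the authors' implementation.

```python
# Hypothetical retrieval of grasping priors and a simple distribution-based
# regularizer; the database schema and field names are assumptions.
import numpy as np

def retrieve_grasp_priors(query_part, query_affordance, database, top_k=8):
    """Return the top-k grasp instances with a matching contact part and
    the most similar affordance embedding (cosine similarity)."""
    candidates = [g for g in database if g["contact_part"] == query_part]
    if not candidates:
        candidates = database                      # fall back to the full set
    q = query_affordance / (np.linalg.norm(query_affordance) + 1e-8)
    scores = []
    for g in candidates:
        a = g["affordance"] / (np.linalg.norm(g["affordance"]) + 1e-8)
        scores.append(float(q @ a))
    order = np.argsort(scores)[::-1][:top_k]
    return [candidates[i] for i in order]

def prior_regularizer(hand_pose, retrieved):
    """Penalize deviation from the mean retrieved hand configuration, a
    crude stand-in for regularizing with the retrieved grasp distribution."""
    mean_pose = np.mean([g["hand_pose"] for g in retrieved], axis=0)
    return float(np.sum((hand_pose - mean_pose) ** 2))
```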
Learning the Geometric Mechanics of Robot Motion Using Gaussian Mixtures
Hu, Ruizhen, Revzen, Shai
Data-driven models of robot motion constructed using principles from Geometric Mechanics have been shown to produce useful predictions of robot motion for a variety of robots. For robots with a useful number of DoF, these geometric mechanics models can only be constructed in the neighborhood of a gait. Here we show how Gaussian Mixture Models (GMM) can be used as a form of manifold learning that learns the structure of the Geometric Mechanics "motility map" and demonstrate: [i] a sizable improvement in prediction quality compared to previously published methods; [ii] a method that can be applied to any motion dataset and not only periodic gait data; [iii] a way to pre-process the dataset to facilitate extrapolation in regions where the motility map is known to be linear. Our results can be applied anywhere a data-driven geometric motion model might be useful.
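As a rough illustration of the idea, the following sketch fits a Gaussian mixture over joint (shape-velocity, body-velocity) samples and uses Gaussian mixture regression to predict body velocity, i.e., a learned stand-in for the motility map. The interface and component count are assumptions, not the paper's code.

```python
# A minimal GMM-as-regressor sketch: learn p(shape_vel, body_vel) and
# predict E[body_vel | shape_vel] by Gaussian mixture regression.
import numpy as np
from sklearn.mixture import GaussianMixture

class GMMMotilityModel:
    def __init__(self, n_components=8):
        self.gmm = GaussianMixture(n_components=n_components,
                                   covariance_type="full")

    def fit(self, shape_vel, body_vel):
        self.dx = shape_vel.shape[1]
        self.gmm.fit(np.hstack([shape_vel, body_vel]))   # joint density
        return self

    def predict(self, shape_vel):
        dx = self.dx
        mus, covs, ws = self.gmm.means_, self.gmm.covariances_, self.gmm.weights_
        preds = np.zeros((shape_vel.shape[0], mus.shape[1] - dx))
        for n, x in enumerate(shape_vel):
            resp, cond_means = [], []
            for k in range(len(ws)):
                mx, my = mus[k, :dx], mus[k, dx:]
                Sxx, Sxy = covs[k, :dx, :dx], covs[k, :dx, dx:]
                diff = x - mx
                # responsibility of component k for this input
                rk = ws[k] * np.exp(-0.5 * diff @ np.linalg.solve(Sxx, diff)) \
                     / np.sqrt(np.linalg.det(2 * np.pi * Sxx))
                resp.append(rk)
                # conditional mean: my + Syx Sxx^{-1} (x - mx)
                cond_means.append(my + Sxy.T @ np.linalg.solve(Sxx, diff))
            resp = np.array(resp) / (np.sum(resp) + 1e-12)
            preds[n] = np.sum(resp[:, None] * np.array(cond_means), axis=0)
        return preds
```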
G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation
Chen, Tianxing, Mu, Yao, Liang, Zhixuan, Chen, Zanxin, Peng, Shijia, Chen, Qiangyu, Xu, Mingkun, Hu, Ruizhen, Zhang, Hongyuan, Li, Xuelong, Luo, Ping
Recent advances in imitation learning for 3D robotic manipulation have shown promising results with diffusion-based policies. However, achieving human-level dexterity requires seamless integration of geometric precision and semantic understanding. We present G3Flow, a novel framework that constructs real-time semantic flow, a dynamic, object-centric 3D semantic representation, by leveraging foundation models. Our approach uniquely combines 3D generative models for digital twin creation, vision foundation models for semantic feature extraction, and robust pose tracking for continuous semantic flow updates. This integration enables complete semantic understanding even under occlusions while eliminating manual annotation requirements. By incorporating semantic flow into diffusion policies, we demonstrate significant improvements in both terminal-constrained manipulation and cross-object generalization. Extensive experiments across five simulation tasks show that G3Flow consistently outperforms existing approaches, achieving up to 68.3% and 50.1% average success rates on terminal-constrained manipulation and cross-object generalization tasks, respectively. Our results demonstrate the effectiveness of G3Flow in enhancing real-time dynamic semantic feature understanding for robotic manipulation policies.
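At a very high level, the semantic-flow idea can be pictured as a canonical feature cloud carried along by a tracked 6-DoF pose, as in the sketch below. The class and method names are illustrative assumptions; the digital-twin generation, feature extraction, and pose tracker are treated as black boxes.

```python
# A minimal sketch of maintaining an object-centric semantic feature cloud
# under a tracked pose, so the semantic field stays complete under occlusion.
import numpy as np

class SemanticFlow:
    def __init__(self, canonical_points, canonical_features):
        # canonical_points: (N, 3) points from the generated digital twin
        # canonical_features: (N, D) per-point semantic features
        self.points = canonical_points
        self.features = canonical_features

    def update(self, R, t):
        """Transform the canonical feature cloud into the current frame
        using the tracked rotation R (3x3) and translation t (3,)."""
        world_points = self.points @ R.T + t
        return world_points, self.features
```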
PC-Planner: Physics-Constrained Self-Supervised Learning for Robust Neural Motion Planning with Shape-Aware Distance Function
Shen, Xujie, Peng, Haocheng, Yang, Zesong, Xu, Juzhan, Bao, Hujun, Hu, Ruizhen, Cui, Zhaopeng
Motion Planning (MP) is a critical challenge in robotics, especially pertinent with the burgeoning interest in embodied artificial intelligence. Traditional MP methods often struggle with high-dimensional complexities. Recently, neural motion planners, particularly physics-informed neural planners based on the Eikonal equation, have been proposed to overcome the curse of dimensionality. However, these methods perform poorly in complex scenarios with shaped robots due to the multiple solutions inherent in the Eikonal equation. To address these issues, this paper presents PC-Planner, a novel physics-constrained self-supervised learning framework for motion planning of variously shaped robots in complex environments. To this end, we propose several physical constraints, including monotonic and optimal constraints, to stabilize the training of the neural network with the Eikonal equation. Additionally, we introduce a novel shape-aware distance field that considers the robot's shape for efficient collision checking and Ground Truth (GT) speed computation. This field reduces computational intensity and facilitates adaptive motion planning at test time. Experiments in diverse scenarios with different robots demonstrate the superiority of the proposed method in efficiency and robustness for robot motion planning, particularly in complex environments.
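For readers unfamiliar with Eikonal-based neural planning, the following PyTorch sketch shows the kind of self-supervised residual such planners train on: the gradient norm of a learned travel-time field should match the reciprocal of the local speed. The `time_net` interface and the use of `speed_gt` (standing in for the speed derived from a shape-aware distance field) are assumptions; PC-Planner's additional monotonic and optimal constraints are omitted here.

```python
# A hedged sketch of an Eikonal-style training loss for a neural time field.
import torch

def eikonal_loss(time_net, x_start, x_goal, speed_gt):
    """time_net(x_start, x_goal) -> travel time per pair.
    Enforce |grad_goal T| * speed ≈ 1, i.e. the Eikonal equation."""
    x_goal = x_goal.clone().requires_grad_(True)
    T = time_net(x_start, x_goal)
    grad = torch.autograd.grad(T.sum(), x_goal, create_graph=True)[0]
    pred_speed = 1.0 / (grad.norm(dim=-1) + 1e-8)
    return torch.mean((pred_speed - speed_gt) ** 2)
```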
Learning Cross-hand Policies for High-DOF Reaching and Grasping
She, Qijin, Zhang, Shishun, Ye, Yunfan, Hu, Ruizhen, Xu, Kai
Reaching-and-grasping is a fundamental skill for robotic manipulation, but existing methods usually train models on a specific gripper that cannot be reused on another gripper. In this paper, we propose a novel method for learning a unified policy model that can be easily transferred to different dexterous grippers. Our method consists of two stages: a gripper-agnostic policy model that predicts the displacements of pre-defined key points on the gripper, and a gripper-specific adaptation model that translates these displacements into adjustments for controlling the grippers' joints. The gripper state and interactions with objects are captured at the finger level using robust geometric representations, integrated with a transformer-based network to address variations in gripper morphology and geometry. In the experiments, we evaluate our method on several dexterous grippers and diverse objects, and the results show that our method significantly outperforms the baseline methods. Pioneering the transfer of grasp policies across dexterous grippers, our method demonstrates its potential for learning generalizable and transferable manipulation skills for various robotic hands.
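The two-stage decomposition can be illustrated with the sketch below: a gripper-agnostic policy outputs Cartesian displacements for predefined keypoints, and a gripper-specific step converts them into joint adjustments. The paper learns the adaptation model; the Jacobian pseudo-inverse used here is only a simple stand-in for that learned mapping.

```python
# Illustrative two-stage control step; the policy and Jacobian are assumed inputs.
import numpy as np

def adapt_to_joints(keypoint_disp, jacobian):
    """keypoint_disp: (K*3,) desired displacements of gripper keypoints.
    jacobian: (K*3, J) keypoint Jacobian of the specific gripper."""
    return np.linalg.pinv(jacobian) @ keypoint_disp

def control_step(policy, observation, jacobian):
    disp = policy(observation)            # gripper-agnostic keypoint displacements
    dq = adapt_to_joints(disp, jacobian)  # gripper-specific joint adjustments
    return dq
```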
Spatial and Surface Correspondence Field for Interaction Transfer
Huang, Zeyu, Xu, Honghao, Huang, Haibin, Ma, Chongyang, Huang, Hui, Hu, Ruizhen
In this paper, we introduce a new method for the task of interaction transfer. Given an example interaction between a source object and an agent, our method can automatically infer both surface and spatial relationships for the agent and target objects within the same category, yielding more accurate and valid transfers. Specifically, our method characterizes the example interaction using a combined spatial and surface representation. We map the agent points and object points associated with this representation to the target object space using a learned spatial and surface correspondence field, which represents objects as deformed and rotated signed distance fields. With the corresponded points, an optimization is performed under the constraints of our spatial and surface interaction representation and additional regularization. Experiments conducted on human-chair and hand-mug interaction transfer tasks show that our approach can handle larger geometry and topology variations between source and target shapes, significantly outperforming state-of-the-art methods.
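The final optimization stage can be pictured with the toy sketch below, which adjusts only a translation of the agent so that its points meet their corresponded surface contacts and spatial locations on the target. The energy terms, weights, and restriction to translation are simplifications for illustration, not the paper's formulation.

```python
# A hedged sketch of pose refinement against corresponded surface/spatial points.
import torch

def transfer_optimization(agent_points, target_contacts, target_offsets, steps=200):
    """agent_points: (N,3) agent points in the initial pose.
    target_contacts: (N,3) corresponded surface points on the target.
    target_offsets: (N,3) corresponded spatial (off-surface) locations."""
    t = torch.zeros(3, requires_grad=True)          # translation to optimize
    opt = torch.optim.Adam([t], lr=1e-2)
    for _ in range(steps):
        opt.zero_grad()
        moved = agent_points + t
        surface_term = ((moved - target_contacts) ** 2).sum(dim=-1).mean()
        spatial_term = ((moved - target_offsets) ** 2).sum(dim=-1).mean()
        loss = surface_term + 0.1 * spatial_term     # illustrative weighting
        loss.backward()
        opt.step()
    return t.detach()
```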
Synchronized Dual-arm Rearrangement via Cooperative mTSP
Li, Wenhao, Zhang, Shishun, Dai, Sisi, Huang, Hui, Hu, Ruizhen, Chen, Xiaohong, Xu, Kai
Synchronized dual-arm rearrangement is widely studied as a common scenario in industrial applications. It often faces scalability challenges due to the computational complexity of robotic arm rearrangement and the high-dimensional nature of dual-arm planning. To address these challenges, we formulate the problem as a cooperative mTSP, a variant of mTSP in which agents share cooperative costs, and solve it with reinforcement learning. Our approach represents rearrangement tasks using a task state graph that captures spatial relationships and a cooperative cost matrix that provides details about action costs. Taking these representations as observations, we design an attention-based network to effectively combine them and provide rational task scheduling. Furthermore, a cost predictor is introduced to directly evaluate actions during both training and planning, significantly expediting the planning process. Our experimental results demonstrate that our approach outperforms existing methods in terms of both performance and planning efficiency.
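A minimal PyTorch sketch of the scheduling step is given below: task-graph node features are combined by self-attention, the cooperative cost of each candidate action is appended, and a greedy decode picks the next task. Layer sizes and the exact way costs are injected are assumptions, not the paper's architecture.

```python
# Illustrative attention-based task scoring over a task state graph
# plus a cooperative cost matrix, followed by greedy decoding.
import torch
import torch.nn as nn

class TaskScorer(nn.Module):
    def __init__(self, node_dim, hidden=128, heads=4):
        super().__init__()
        self.embed = nn.Linear(node_dim, hidden)
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.head = nn.Linear(hidden + 1, 1)    # +1 for the cooperative cost

    def forward(self, node_feats, coop_cost):
        # node_feats: (B, N, node_dim) task-state graph nodes
        # coop_cost:  (B, N) cooperative cost of choosing each task next
        h = self.embed(node_feats)
        h, _ = self.attn(h, h, h)
        scores = self.head(torch.cat([h, coop_cost.unsqueeze(-1)], dim=-1))
        return scores.squeeze(-1)                # (B, N) assignment logits

def greedy_schedule(scorer, node_feats, coop_cost, mask):
    logits = scorer(node_feats, coop_cost)
    logits = logits.masked_fill(~mask, float("-inf"))  # exclude finished tasks
    return logits.argmax(dim=-1)                       # next task index
```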
Learning Dual-arm Object Rearrangement for Cartesian Robots
Zhang, Shishun, She, Qijin, Li, Wenhao, Zhu, Chenyang, Wang, Yongjun, Hu, Ruizhen, Xu, Kai
This work focuses on the dual-arm object rearrangement problem abstracted from a realistic industrial scenario of Cartesian robots. The goal is to transfer all the objects from sources to targets with the minimum total completion time. To achieve this, the core idea is to develop an effective object-to-arm task assignment strategy that minimizes the cumulative task execution time and maximizes dual-arm cooperation efficiency. One of the difficulties in task assignment is scalability. As the number of objects increases, the computation time of traditional offline-search-based methods grows rapidly due to the combinatorial complexity. Encouraged by the adaptability of reinforcement learning (RL) to long-sequence task decisions, we propose an online task assignment method based on RL whose computation time increases only linearly with the number of objects. Further, we design an attention-based network to model the dependencies between the input states throughout the task execution process, helping to find the most reasonable object-to-arm correspondence in each assignment round. In the experiments, we adapt several search-based methods to this specific setting and compare our method against them. The results show that our approach outperforms the search-based methods in total execution time and computational efficiency, and also verify the generalization of our method to different numbers of objects. In addition, we show the effectiveness of our method deployed on a real robot in the supplementary video.
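The online assignment loop can be sketched as follows: each round issues one forward pass of the learned policy to pick an (arm, object) pair, so the number of decisions grows linearly with the number of objects. The policy and execution functions are placeholders for the attention-based network and the simulated/real execution.

```python
# Illustrative online object-to-arm assignment loop; policy and
# execute_fn are assumed, stand-in interfaces.
def rearrange(objects, policy, execute_fn):
    """One decision round per object: the policy observes the current
    state and returns (arm index, object index)."""
    remaining = list(range(len(objects)))
    total_time = 0.0
    while remaining:
        arm, idx = policy(objects, remaining)      # attention-based assignment
        total_time += execute_fn(arm, objects[idx])  # run the pick-and-place
        remaining.remove(idx)
    return total_time
```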
Neural Packing: from Visual Sensing to Reinforcement Learning
Xu, Juzhan, Gong, Minglun, Zhang, Hao, Huang, Hui, Hu, Ruizhen
We present a novel learning framework to solve the transport-and-packing (TAP) problem in 3D. It constitutes a full solution pipeline, from partial observation of the input objects via RGBD sensing and recognition, through robotic motion planning, to the final box placements that form a compact packing in the target container. The technical core of our method is a neural network for TAP, trained via reinforcement learning (RL), to solve the NP-hard combinatorial optimization problem. Our network simultaneously selects an object to pack and determines the final packing location, based on a judicious encoding of the continuously evolving states of partially observed source objects and available spaces in the target container, using separate encoders, both enabled with attention mechanisms. The encoded feature vectors are employed to compute the matching scores and feasibility masks of different pairings of box selection and available space configuration for packing strategy optimization. Extensive experiments, including ablation studies and physical packing execution by a real robot (Universal Robot UR5e), are conducted to evaluate our method in terms of its design choices, scalability, generalizability, and comparisons to baselines, including the most recent RL-based TAP solution. We also contribute the first benchmark for TAP, which covers a variety of input settings and difficulty levels.
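The pairing step can be illustrated with the short sketch below: encoded object candidates and encoded empty-space candidates are matched by a score matrix, a feasibility mask removes placements that do not fit, and a distribution over feasible (object, space) pairs is produced. The encoders themselves are treated as given; the scaling and masking choices are illustrative assumptions.

```python
# Illustrative matching of object embeddings to space embeddings with a
# feasibility mask, yielding a distribution over packing actions.
import torch

def packing_logits(object_emb, space_emb, feasible):
    # object_emb: (O, D) features of partially observed source objects
    # space_emb:  (S, D) features of available spaces in the container
    # feasible:   (O, S) boolean mask of placements that physically fit
    scores = object_emb @ space_emb.T / object_emb.shape[-1] ** 0.5
    scores = scores.masked_fill(~feasible, float("-inf"))
    return torch.softmax(scores.flatten(), dim=0).view_as(scores)
```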
Semi-Weakly Supervised Object Kinematic Motion Prediction
Liu, Gengxin, Sun, Qian, Huang, Haibin, Ma, Chongyang, Guo, Yulan, Yi, Li, Huang, Hui, Hu, Ruizhen
Given a 3D object, kinematic motion prediction aims to identify the mobile parts as well as the corresponding motion parameters. Due to the large variations in both topological structure and geometric detail of 3D objects, this remains a challenging task, and the lack of large-scale labeled data also constrains the performance of deep-learning-based approaches. In this paper, we tackle the object kinematic motion prediction problem in a semi-weakly supervised manner. Our key observations are two-fold. First, although 3D datasets with fully annotated motion labels are limited, there are existing large-scale datasets and methods for object part semantic segmentation. Second, semantic part segmentation and mobile part segmentation are not always consistent, but it is possible to detect the mobile parts from the underlying 3D structure. To this end, we propose a graph neural network to learn the map from hierarchical part-level segmentation to mobile part parameters, which are further refined based on geometric alignment. This network is first trained on the PartNet-Mobility dataset with fully labeled mobility information and then applied to the PartNet dataset with fine-grained and hierarchical part-level segmentation. The network predictions yield a large set of 3D objects with pseudo-labeled mobility information, which can further be used for weakly supervised learning with the pre-existing segmentation. Our experiments show significant performance boosts with the augmented data for a previous method designed for kinematic motion prediction on 3D partial scans.
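The mapping from part-level segmentation to mobility parameters can be sketched as a small graph network over part nodes, as below. Feature sizes, the two rounds of message passing, and the output parameterization (motion-type logits plus an axis direction and pivot) are illustrative choices, not the paper's exact architecture.

```python
# A hedged sketch of a graph network over part-segmentation nodes that
# predicts per-part mobility attributes.
import torch
import torch.nn as nn

class PartMotionGNN(nn.Module):
    def __init__(self, in_dim, hidden=128, n_types=3):
        super().__init__()
        self.enc = nn.Linear(in_dim, hidden)
        self.msg = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU())
        self.type_head = nn.Linear(hidden, n_types)   # e.g. fixed / revolute / prismatic
        self.axis_head = nn.Linear(hidden, 6)         # axis direction + pivot point

    def forward(self, part_feats, adjacency):
        # part_feats: (P, in_dim) per-part geometry features
        # adjacency:  (P, P) 0/1 matrix from the part hierarchy
        h = torch.relu(self.enc(part_feats))
        for _ in range(2):                            # two rounds of message passing
            agg = adjacency @ h / (adjacency.sum(-1, keepdim=True) + 1e-6)
            h = h + self.msg(torch.cat([h, agg], dim=-1))
        return self.type_head(h), self.axis_head(h)
```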