
Collaborating Authors

interpenetration



It Takes Two: Learning Interactive Whole-Body Control Between Humanoid Robots

Liu, Zuhong, Ge, Junhao, Xiong, Minhao, Gu, Jiahao, Tang, Bowei, Jing, Wei, Chen, Siheng

arXiv.org Artificial Intelligence

The true promise of humanoid robotics lies beyond single-agent autonomy: two or more humanoids must engage in physically grounded, socially meaningful whole-body interactions that echo the richness of human social interaction. However, single-humanoid methods suffer from an isolation issue, ignoring inter-agent dynamics and causing misaligned contacts, interpenetrations, and unrealistic motions. To address this, we present Harmanoid, a dual-humanoid motion imitation framework that transfers interacting human motions to two robots while preserving both kinematic fidelity and physical realism. Harmanoid comprises two key components: (i) contact-aware motion retargeting, which restores inter-body coordination by aligning SMPL contacts with robot vertices, and (ii) an interaction-driven motion controller, which leverages interaction-specific rewards to enforce coordinated keypoints and physically plausible contacts. By explicitly modeling inter-agent contacts and interaction-aware dynamics, Harmanoid captures the coupled behaviors between humanoids that single-humanoid frameworks inherently overlook. Experiments demonstrate that Harmanoid significantly improves interactive motion imitation, surpassing existing single-humanoid frameworks that largely fail in such scenarios.
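The abstract describes the interaction-specific rewards only at a high level. As an illustration of what such a reward could look like, the sketch below combines an exponential keypoint-tracking term with a contact-alignment bonus; all names, weights, and thresholds here are assumptions for illustration, not the paper's actual design:

```python
import numpy as np

def interaction_reward(robot_kp, ref_kp, contact_pairs,
                       contact_thresh=0.05, sigma=0.1):
    """Hypothetical interaction-aware reward: an exponential
    keypoint-tracking term plus a bonus for annotated inter-robot
    contact pairs that are actually close in the current state."""
    # Keypoint tracking: penalize deviation of both robots' keypoints
    # from the retargeted reference motion.
    track = np.exp(-np.mean(np.linalg.norm(robot_kp - ref_kp, axis=-1)) / sigma)
    if len(contact_pairs) == 0:
        return track
    # Contact term: fraction of designated contact pairs whose current
    # distance is below the contact threshold.
    dists = np.array([np.linalg.norm(a - b) for a, b in contact_pairs])
    contact = np.mean(dists < contact_thresh)
    return 0.5 * track + 0.5 * contact
```

With perfect tracking and all contact pairs touching, the reward saturates at 1.0; the 0.5/0.5 weighting is an arbitrary placeholder.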



Multi-Person Interaction Generation from Two-Person Motion Priors

Xu, Wenning, Fan, Shiyu, Henderson, Paul, Ho, Edmond S. L.

arXiv.org Artificial Intelligence

Generating realistic human motion with high-level controls is a crucial task for social understanding, robotics, and animation. With high-quality MOCAP data becoming more available recently, a wide range of data-driven approaches have been presented. However, modelling multi-person interactions remains a less explored area. In this paper, we present Graph-driven Interaction Sampling, a method that can generate realistic and diverse multi-person interactions by leveraging existing two-person motion diffusion models as motion priors. Instead of training a new model specific to multi-person interaction synthesis, our key insight is to spatially and temporally separate complex multi-person interactions into a graph structure of two-person interactions, which we name the Pairwise Interaction Graph. We thus decompose the generation task into simultaneous single-person motion generation conditioned on one another's motion. In addition, to reduce artifacts such as interpenetrations of body parts in generated multi-person interactions, we introduce two graph-dependent guidance terms into the diffusion sampling scheme. Unlike previous work, our method can produce varied high-quality multi-person interactions without repetitive individual motions. Extensive experiments demonstrate that our approach consistently outperforms existing methods in reducing artifacts when generating a wide range of two-person and multi-person interactions.
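The abstract does not spell out the two graph-dependent guidance terms. A minimal sketch of the general idea behind anti-penetration guidance (treat joints as spheres and nudge overlapping ones apart during sampling); the function name, radius, and step size are illustrative assumptions, not the paper's formulation:

```python
import numpy as np

def anti_penetration_guidance(x_a, x_b, radius=0.1, step=0.5):
    """Illustrative guidance step: model each joint of two interacting
    characters as a sphere of the given radius and push overlapping
    joints apart, analogous to following the gradient of a penetration
    penalty during diffusion sampling."""
    x_a, x_b = x_a.copy(), x_b.copy()
    for i in range(len(x_a)):
        d = x_a[i] - x_b[i]
        dist = np.linalg.norm(d)
        overlap = 2 * radius - dist  # positive when spheres intersect
        if overlap > 0 and dist > 1e-8:
            push = step * overlap * d / dist
            x_a[i] += 0.5 * push  # split the correction between both
            x_b[i] -= 0.5 * push
    return x_a, x_b
```

In a real sampler this correction would be applied to the predicted clean sample at each denoising step, scaled by the noise schedule.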


Modeling Dynamic Hand-Object Interactions with Applications to Human-Robot Handovers

Christen, Sammy

arXiv.org Artificial Intelligence

Humans frequently grasp, manipulate, and move objects. Interactive systems assist humans in these tasks, enabling applications in Embodied AI, human-robot interaction, and virtual reality. However, current methods in hand-object synthesis often neglect dynamics and focus on generating static grasps. The first part of this dissertation introduces dynamic grasp synthesis, where a hand grasps and moves an object to a target pose. We approach this task using physical simulation and reinforcement learning. We then extend this to bimanual manipulation and articulated objects, requiring fine-grained coordination between hands. In the second part of this dissertation, we study human-to-robot handovers. We integrate captured human motion into simulation and introduce a student-teacher framework that adapts to human behavior and transfers from sim to real. To overcome data scarcity, we generate synthetic interactions, increasing training diversity by 100x. Our user study finds no difference between policies trained on synthetic vs. real motions.


Target Pose Guided Whole-body Grasping Motion Generation for Digital Humans

Shao, Quanquan, Fang, Yi

arXiv.org Artificial Intelligence

Grasping manipulation is a fundamental mode of human interaction with everyday objects, and the synthesis of grasping motion is in great demand in applications such as animation and robotics. In the object-grasping research field, most works focus on generating the final static grasping pose with a parallel gripper or dexterous hand; grasping motion generation for the full arm, and especially for a full humanlike intelligent agent, remains under-explored. In this work, we propose a grasping motion generation framework for a digital human, an anthropomorphic intelligent agent with high degrees of freedom in a virtual world. Given an object's known initial pose in 3D space, we first generate a target pose for the whole-body digital human based on off-the-shelf target grasping pose generation methods. From the initial pose and this generated target pose, a transformer-based neural network generates the whole grasping trajectory, connecting the two smoothly and naturally. Additionally, two post-optimization components are designed to mitigate the foot-skating issue and hand-object interpenetration separately. Experiments are conducted on the GRAB dataset to demonstrate the effectiveness of the proposed method for whole-body grasping motion generation with randomly placed unknown objects.


Physically Plausible Full-Body Hand-Object Interaction Synthesis

Braun, Jona, Christen, Sammy, Kocabas, Muhammed, Aksan, Emre, Hilliges, Otmar

arXiv.org Artificial Intelligence

We propose a physics-based method for synthesizing dexterous hand-object interactions in a full-body setting. While recent advancements have addressed specific facets of human-object interactions, a comprehensive physics-based approach remains a challenge. Existing methods often focus on isolated segments of the interaction process and rely on data-driven techniques that may result in artifacts. In contrast, our proposed method embraces reinforcement learning (RL) and physics simulation to mitigate the limitations of data-driven approaches. Through a hierarchical framework, we first learn skill priors for both body and hand movements in a decoupled setting. The generic skill priors learn to decode a latent skill embedding into the motion of the underlying part. A high-level policy then controls hand-object interactions in these pretrained latent spaces, guided by task objectives of grasping and 3D target trajectory following. It is trained using a novel reward function that combines an adversarial style term with a task reward, encouraging natural motions while fulfilling the task incentives. Our method successfully accomplishes the complete interaction task, from approaching an object to grasping and subsequent manipulation. We compare our approach against kinematics-based baselines and show that it leads to more physically plausible motions.


Differentiable Collision Detection for a Set of Convex Primitives

Tracy, Kevin, Howell, Taylor A., Manchester, Zachary

arXiv.org Artificial Intelligence

Collision detection between objects is critical for simulation, control, and learning for robotic systems. However, existing collision detection routines are inherently non-differentiable, limiting their applications in gradient-based optimization tools. In this work, we propose DCOL: a fast and fully differentiable collision-detection framework that reasons about collisions between a set of composable and highly expressive convex primitive shapes. This is achieved by formulating the collision detection problem as a convex optimization problem that solves for the minimum uniform scaling applied to each primitive before they intersect. The optimization problem is fully differentiable with respect to the configurations of each primitive and is able to return a collision detection metric and contact points on each object, agnostic of interpenetration. We demonstrate the capabilities of DCOL on a range of robotics problems from trajectory optimization and contact physics, and have made an open-source implementation available.
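For the special case of two spheres, the minimum-uniform-scaling metric that DCOL solves for has a closed form, which makes the idea concrete (a sketch of the metric only, not the library's API): scaling both spheres uniformly about their own centers by a factor alpha, they first touch when alpha * (r1 + r2) equals the center distance, so alpha < 1 indicates interpenetration and alpha > 1 separation. The value is smooth in the sphere configurations, which is what makes the metric useful for gradient-based optimization:

```python
import numpy as np

def min_scaling_spheres(c1, r1, c2, r2):
    """Closed-form special case of a minimum-uniform-scaling collision
    metric for two spheres: the scale alpha at which the spheres, each
    scaled about its own center, first touch. alpha < 1 means the
    original spheres intersect."""
    return np.linalg.norm(np.asarray(c1) - np.asarray(c2)) / (r1 + r2)
```

For general convex primitives no such closed form exists, which is why DCOL poses the scaling problem as a differentiable convex program instead.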


Leveraging Symbolic Algebra Systems to Simulate Contact Dynamics in Rigid Body Systems

Asci, Simone, Nanjangud, Angadh

arXiv.org Artificial Intelligence

Collision detection plays a key role in the simulation of interacting rigid bodies. However, owing to its computational complexity, current methods typically prioritize either maximizing processing speed or fidelity to real-world behaviors. Fast real-time detection is achieved by simulating collisions with simple geometric shapes, whereas incorporating more realistic geometries with multiple points of contact requires considerable computing power, which slows down collision detection. In this work, we present a new approach to modeling and simulating collision-inclusive multibody dynamics by leveraging a computer algebra system (CAS). This approach offers flexibility in modeling a diverse set of multibody-system applications, ranging from human biomechanics to space manipulators with docking interfaces, since the geometric relationships between points and rigid bodies are handled in a generalizable manner. We also analyze the performance of integrating this symbolic modeling approach with collision detection formulated either as a traditional overlap test or as a convex optimization problem. We compare these two collision detection methods in different scenarios and resolve collisions using a penalty-based method to simulate dynamics. This work demonstrates an effective simplification in solving collision dynamics problems using a symbolic approach, especially for the algorithm based on convex optimization, which is simpler to implement and, in complex collision scenarios, faster than the overlap test.
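The penalty-based resolution mentioned above can be illustrated with a standard spring-damper normal force on the penetration depth; the gains and the non-negativity clamp below are generic textbook choices, not the paper's specific formulation:

```python
def penalty_contact_force(penetration_depth, penetration_rate,
                          k=1e4, c=50.0):
    """Minimal sketch of a penalty contact model: a spring-damper
    normal force proportional to penetration depth and its rate,
    clamped so the contact never pulls the bodies together."""
    if penetration_depth <= 0.0:
        return 0.0  # bodies separated: no contact force
    f = k * penetration_depth + c * penetration_rate
    return max(f, 0.0)  # contacts can only push, never pull
```

The stiffness k trades off penetration allowed against numerical stiffness of the resulting ODE, which is why penalty methods usually pair with small integration steps.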


Multi-Finger Grasping Like Humans

Du, Yuming, Weinzaepfel, Philippe, Lepetit, Vincent, Brégier, Romain

arXiv.org Artificial Intelligence

Robots with multi-fingered grippers could perform advanced manipulation tasks for us if we were able to properly specify to them what to do. In this study, we take a step in that direction by making a robot grasp an object like a grasping demonstration performed by a human. We propose a novel optimization-based approach for transferring human grasp demonstrations to any multi-fingered grippers, which produces robotic grasps that mimic the human hand orientation and the contact area with the object, while alleviating interpenetration. Extensive experiments with the Allegro and BarrettHand grippers show that our method leads to grasps more similar to the human demonstration than existing approaches, without requiring any gripper-specific tuning. We confirm these findings through a user study and validate the applicability of our approach on a real robot.