Goto

Collaborating Authors

Fang, Jiaying


Phantom: Training Robots Without Robots Using Only Human Videos

arXiv.org Artificial Intelligence

Our method enables training robot policies without collecting any robot data. We first collect human video demonstrations in diverse environments and use inpainting to remove the human hand. A rendered robot is then inserted into the scene using the estimated hand pose. The resulting augmented dataset is used to train an imitation learning policy, which is deployed zero-shot on a real robot.

Abstract -- Scaling robotics data collection is critical to advancing general-purpose robots. Current approaches often rely on teleoperated demonstrations, which are difficult to scale. We propose a novel data collection method that eliminates the need for robotics hardware by leveraging human video demonstrations. By training imitation learning policies on this human data, our approach enables zero-shot deployment on robots without collecting any robot-specific data. To bridge the embodiment gap between human and robot appearances, we use a data editing approach on the input observations that aligns the image distributions between training data on humans and test data on robots. Our method significantly reduces the cost of diverse data collection by allowing anyone with an RGBD camera to contribute. We demonstrate that our approach works in diverse, unseen environments and on varied tasks.

INTRODUCTION -- Data scarcity remains a key challenge in advancing robotics research. While large-scale data collection efforts are gaining momentum, even the largest robotics datasets [1, 7] are significantly smaller than those used to train generalist models in natural language processing and computer vision. These efforts are constrained by the slow and costly process of collecting data with robotics hardware.
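The per-frame data editing step described in the abstract (inpaint the human hand, then composite a rendered robot at the estimated hand pose) can be illustrated with a minimal sketch. This is not the paper's released code: it assumes a hand mask and a pose-aligned RGBA robot rendering are already computed upstream, and only shows the compositing with standard OpenCV/NumPy calls.

```python
"""Minimal sketch of the human-to-robot frame editing step.

Assumptions (not from the paper): a per-frame hand mask and a pre-rendered
robot overlay (RGBA, already placed at the estimated hand pose) are given.
"""
import cv2
import numpy as np


def edit_frame(frame_bgr: np.ndarray,
               hand_mask: np.ndarray,
               robot_rgba: np.ndarray) -> np.ndarray:
    """Remove the human hand and composite a rendered robot in its place.

    frame_bgr : HxWx3 uint8 human demonstration frame
    hand_mask : HxW mask, nonzero where the hand is visible
    robot_rgba: HxWx4 uint8 robot rendering with an alpha channel
    """
    # 1. Inpaint the hand region so the background looks plausible.
    mask = (hand_mask > 0).astype(np.uint8)
    hand_free = cv2.inpaint(frame_bgr, mask, 5, cv2.INPAINT_TELEA)

    # 2. Alpha-composite the rendered robot onto the inpainted frame.
    alpha = robot_rgba[..., 3:4].astype(np.float32) / 255.0
    robot_bgr = robot_rgba[..., :3].astype(np.float32)
    out = alpha * robot_bgr + (1.0 - alpha) * hand_free.astype(np.float32)
    return out.astype(np.uint8)
```

Applying this to every frame of a human video yields the augmented, robot-looking observations on which the imitation policy is then trained.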


Force-Aware Autonomous Robotic Surgery

arXiv.org Artificial Intelligence

This work demonstrates the benefits of using tool-tissue interaction forces in the design of autonomous systems in robot-assisted surgery (RAS). Autonomous systems in surgery must manipulate tissues of different stiffness levels and hence should apply correspondingly different levels of force. We hypothesize that this ability is enabled by using force measurements as input to policies learned from human demonstrations. To test this hypothesis, we use Action-Chunking Transformers (ACT) to train two policies through imitation learning for automated tissue retraction with the da Vinci Research Kit (dVRK). To quantify the effects of using tool-tissue interaction force data, we train a "no force policy" that uses only vision and robot kinematic data, and compare it to a "force policy" that uses force, vision, and robot kinematic data. When tested on a previously seen tissue sample, the force policy is 3 times more successful in autonomously performing the task than the no force policy. In addition, the force policy is more gentle with the tissue, exerting on average 62% less force. When tested on a previously unseen tissue sample, the force policy is 3.5 times more successful in autonomously performing the task and exerts an order of magnitude less force on the tissue than the no force policy. These results open the door to designing force-aware autonomous systems that can meet the surgical guidelines for tissue handling, especially using newly released RAS systems with force feedback capabilities such as the da Vinci 5.
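Per the abstract, the two policies differ only in whether the tool-tissue force measurement is included in the policy's input. A minimal sketch of that difference is below; it is not the paper's ACT implementation, and the dimensions, layer sizes, and names are assumptions chosen for illustration.

```python
"""Illustrative sketch (assumed, not the paper's code) of the input difference
between the "no force policy" and the "force policy": whether a force/torque
reading is concatenated to the robot kinematic state before encoding."""
import torch
import torch.nn as nn


class StateEncoder(nn.Module):
    def __init__(self, kin_dim=14, force_dim=6, use_force=True, embed_dim=512):
        super().__init__()
        self.use_force = use_force
        in_dim = kin_dim + (force_dim if use_force else 0)
        # Projects the state into the token dimension of the policy transformer.
        self.proj = nn.Linear(in_dim, embed_dim)

    def forward(self, kinematics, force=None):
        # "no force policy": kinematics only; "force policy": kinematics + force.
        x = torch.cat([kinematics, force], dim=-1) if self.use_force else kinematics
        return self.proj(x)


# Usage: the two policies differ only in this flag.
no_force_enc = StateEncoder(use_force=False)
force_enc = StateEncoder(use_force=True)
state = torch.randn(1, 14)   # robot kinematic state (dimension assumed)
wrench = torch.randn(1, 6)   # tool-tissue force/torque reading (dimension assumed)
token_a = no_force_enc(state)
token_b = force_enc(state, wrench)
```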