Train Robots in a JIF: Joint Inverse and Forward Dynamics with Human and Robot Demonstrations

Khandate, Gagan, Wang, Boxuan, Park, Sarah, Ni, Weizhe, Palacious, Jaoquin, Lampo, Kate, Wu, Philippe, Ho, Rosh, Chang, Eric, Ciocarlie, Matei

Mar-15-2025–arXiv.org Artificial Intelligence

Pre-training on large datasets of robot demonstrations is a powerful technique for learning diverse manipulation skills, but it is often limited by the high cost and complexity of collecting robot-centric data, especially for tasks requiring tactile feedback. This work addresses these challenges by introducing a novel method for pre-training with multi-modal human demonstrations. Our approach jointly learns inverse and forward dynamics to extract latent state representations, with the goal of learning manipulation-specific representations. This enables efficient fine-tuning with only a small number of robot demonstrations, significantly improving data efficiency. Furthermore, our method allows for the use of multi-modal data, such as a combination of vision and touch for manipulation. By leveraging latent dynamics modeling and tactile sensing, this approach paves the way for scalable robot manipulation learning based on human demonstrations.
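To make the idea of jointly learning inverse and forward dynamics concrete, here is a minimal sketch of how such a combined objective could be computed for a single transition. It is not the authors' architecture: the linear maps, the dimensions, the `forward_weight` parameter, and all function names are assumptions chosen for illustration. An observation vector stands in for whatever multi-modal features (for example, concatenated vision and touch) are available.

```python
import random


def linear(weights, x):
    """Apply a weight matrix (a list of rows) to a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def mse(pred, target):
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def init_matrix(rows, cols, rng):
    """Small random weights; a hypothetical initialization."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def joint_dynamics_loss(params, obs_t, action_t, obs_next, forward_weight=1.0):
    """Combined inverse + forward dynamics loss for one transition.

    Inverse dynamics: predict the action from (z_t, z_next).
    Forward dynamics: predict z_next from (z_t, action_t).
    Both share one encoder, so both losses shape the latent state.
    """
    z_t = linear(params["encoder"], obs_t)
    z_next = linear(params["encoder"], obs_next)

    action_pred = linear(params["inverse"], z_t + z_next)   # list concatenation
    z_next_pred = linear(params["forward"], z_t + action_t)

    inverse_loss = mse(action_pred, action_t)
    forward_loss = mse(z_next_pred, z_next)
    total = inverse_loss + forward_weight * forward_loss
    return total, inverse_loss, forward_loss


# Hypothetical sizes: 6-D observation, 3-D latent state, 2-D action.
OBS_DIM, LATENT_DIM, ACTION_DIM = 6, 3, 2
rng = random.Random(0)
params = {
    "encoder": init_matrix(LATENT_DIM, OBS_DIM, rng),
    "inverse": init_matrix(ACTION_DIM, 2 * LATENT_DIM, rng),
    "forward": init_matrix(LATENT_DIM, LATENT_DIM + ACTION_DIM, rng),
}
```

In a setup like this, the shared encoder would be pre-trained on demonstration transitions by minimizing the total loss, and a policy would later be fine-tuned on top of the latent state using a small number of robot demonstrations. How the paper weights the two terms or structures its networks is not stated in the abstract.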

artificial intelligence, demonstration, human demonstration


Country:
- North America > United States
  - New York > New York County > New York City (0.04)
- Europe > Switzerland
  - Basel-City > Basel (0.04)

Genre:
- Research Report > New Finding (0.94)

Technology:
- Information Technology > Artificial Intelligence > Robots > Manipulation (0.89)
