Hand-Object Interaction Pretraining from Videos

Singh, Himanshu Gaurav, Loquercio, Antonio, Sferrazza, Carmelo, Wu, Jane, Qi, Haozhi, Abbeel, Pieter, Malik, Jitendra

Sep-12-2024–arXiv.org Artificial Intelligence

Reusable sensorimotor representations have the potential to give robots access to the versatility of their sensorimotor apparatus, thereby enabling them to achieve a wide variety of goals. Similar to advancements in other AI domains [1, 2], such representations are likely to be trained with unsupervised objectives on large datasets. In this work, we study the feasibility of training such representations using human videos in the context of dexterous manipulation. Using videos as a data engine comes with several advantages: (1) they are abundant; (2) they cover a wide range of skills that we want robots to master; and (3) they reflect natural or socially acceptable behaviors that we want robots to emulate. However, training sensorimotor representations on videos is a challenging endeavor.

