PLEX: Making the Most of the Available Data for Robotic Manipulation Pretraining

Garrett Thomas, Ching-An Cheng, Ricky Loynd, Felipe Vieira Frujeri, Vibhav Vineet, Mihai Jalobeanu, Andrey Kolobov

arXiv.org Artificial Intelligence 

Transformers [1] have led to breakthroughs in training large-scale general representations for computer vision (CV) and natural language processing (NLP) [2], enabling zero-shot adaptation and fast finetuning [3]. Despite this impressive progress, transformer-based representations haven't shown the same versatility for robotic manipulation. Some attribute this gap to the lack of suitable training data for robotics [3]. We argue instead that data relevant to training robotic manipulation models is copious but has important structure that most existing training methods ignore and fail to leverage. These insights lead us to propose a novel transformer-based architecture, called PLEX, that is capable of effective learning from realistically available robotic manipulation datasets. We observe that robotics-relevant data falls into three major categories: (1) Video-only data, which contains high-quality and potentially description-annotated demonstrations for an immense variety of tasks but has no explicit action information for a robot to mimic; (2) Data containing matching sequences of percepts and actions, which is less plentiful than pure videos and doesn't necessarily correspond to meaningful tasks [4], but captures valuable correlations between a robot's actions and changes in the environment and is easy to collect on a given robot; (3) Small sets of high-quality sensorimotor demonstrations for a target task in a target environment. Thus, a scalable model architecture for robotic manipulation must be able to learn primarily from videos, while being extra data-efficient on sensorimotor training sequences and the small number of target demonstrations. PLEX, the PLanning-EXecution architecture we propose, is designed to take advantage of data sources of these types.
