VIOLA: Imitation Learning for Vision-Based Manipulation with Object Proposal Priors

Zhu, Yifeng, Joshi, Abhishek, Stone, Peter, Zhu, Yuke

Mar-8-2023–arXiv.org Artificial Intelligence

We introduce VIOLA, an object-centric imitation learning approach to learning closed-loop visuomotor policies for robot manipulation. Our approach constructs object-centric representations based on general object proposals from a pre-trained vision model. VIOLA uses a transformer-based policy to reason over these representations and attend to the task-relevant visual factors for action prediction. Such object-based structural priors improve deep imitation learning algorithm's robustness against object variations and environmental perturbations. We quantitatively evaluate VIOLA in simulation and on real robots. VIOLA outperforms the state-of-the-art imitation learning methods by $45.8\%$ in success rate. It has also been deployed successfully on a physical robot to solve challenging long-horizon tasks, such as dining table arrangement and coffee making. More videos and model details can be found in supplementary material and the project website: https://ut-austin-rpl.github.io/VIOLA .

artificial intelligence, arxiv preprint arxiv, machine learning, (15 more...)

arXiv.org Artificial Intelligence

Mar-8-2023

arXiv.org PDF

Add feedback

Country:
- Oceania > New Zealand
  - North Island > Auckland Region > Auckland (0.04)
- North America > United States
  - Texas > Travis County > Austin (0.04)

Genre:
- Research Report > New Finding (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Robots (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found