Offline Learning of Counterfactual Perception as Prediction for Real-World Robotic Reinforcement Learning
Jun Jin, Daniel Graves, Cameron Haigh, Jun Luo, Martin Jagersand
–arXiv.org Artificial Intelligence
We propose a method for offline learning of counterfactual predictions to address real-world robotic reinforcement learning challenges. The proposed method encodes action-oriented visual observations as several "what if" questions learned offline from prior experience using reinforcement learning methods. These "what if" questions counterfactually predict, on multiple temporal scales, how action-conditioned observations would evolve if the agent were to stick to its current action. We show that combining these offline counterfactual predictions with online in-situ observations (e.g. force feedback) enables efficient policy learning from only a sparse terminal (success/failure) reward. We argue that the learned predictions form an effective representation of the visual task and guide online exploration toward interactions with high success potential (e.g. contact-rich regions). Experiments were conducted in both simulation and real-world scenarios for evaluation. Our results demonstrate that it is practical to train a reinforcement learning agent to perform real-world fine manipulation in about half a day, without hand-engineered perception systems or calibrated instrumentation. Recordings of the real robot training can be found at https://sites.google.com/view/realrl.
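The abstract's "what if" questions can be read as general value functions that predict a cumulant under the counterfactual policy "keep executing the current action", at several temporal scales (discount factors). Below is a minimal sketch of that idea, not the authors' implementation: the feature size, discount values, the TD(0) update, and the way the fixed-action condition is handled (bootstrapping only on action-repeat transitions) are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup: each "what if" question predicts the discounted sum of a
# cumulant (e.g. an image-derived signal) under the persistent policy
# "stick to the current action". Multiple discounts = multiple timescales.
GAMMAS = [0.0, 0.5, 0.9]   # temporal scales (hypothetical values)
N_FEATURES = 8             # size of the encoded visual observation (assumed)

weights = np.zeros((len(GAMMAS), N_FEATURES))
alpha = 0.1                # TD learning rate (assumed)

def td0_update(phi, phi_next, cumulant, same_action):
    """One offline TD(0) step per temporal scale.

    The counterfactual condition "if the agent were to stick to its current
    action" is approximated off-policy here by bootstrapping only on logged
    transitions where the action was actually repeated.
    """
    for i, gamma in enumerate(GAMMAS):
        v = weights[i] @ phi
        v_next = weights[i] @ phi_next if same_action else 0.0
        td_error = cumulant + gamma * v_next - v
        weights[i] += alpha * td_error * phi

# Offline pass over one logged transition.
phi, phi_next = rng.normal(size=N_FEATURES), rng.normal(size=N_FEATURES)
td0_update(phi, phi_next, cumulant=1.0, same_action=True)

def policy_state(phi, force_feedback):
    """Online state: counterfactual predictions + in-situ observations."""
    predictions = weights @ phi  # one prediction per temporal scale
    return np.concatenate([predictions, force_feedback])

state = policy_state(phi, np.zeros(3))
print(state.shape)  # (len(GAMMAS) + 3,) -> (6,)
```

The point of the sketch is the split the paper argues for: the predictions are trained entirely offline from prior experience, while the downstream policy consumes them online alongside cheap in-situ signals such as force feedback.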
Nov-11-2020