Offline Learning of Counterfactual Perception as Prediction for Real-World Robotic Reinforcement Learning
Jun Jin, Daniel Graves, Cameron Haigh, Jun Luo, Martin Jagersand
–arXiv.org Artificial Intelligence
We propose a method for offline learning of counterfactual predictions to address real-world robotic reinforcement learning challenges. The proposed method encodes action-oriented visual observations as several "what if" questions learned offline from prior experience using reinforcement learning methods. These "what if" questions counterfactually predict, on multiple temporal scales, how action-conditioned observations would evolve if the agent were to stick to its current action. We show that combining these offline counterfactual predictions with online in-situ observations (e.g. force feedback) enables efficient policy learning from only a sparse terminal (success/failure) reward. We argue that the learned predictions form an effective representation of the visual task and guide online exploration toward interactions with high success potential (e.g. contact-rich regions). Experiments were conducted in both simulation and real-world scenarios for evaluation. Our results demonstrate that it is practical to train a reinforcement learning agent to perform real-world fine manipulation in about half a day, without hand-engineered perception systems or calibrated instrumentation. Recordings of the real robot training can be found at https://sites.google.com/view/realrl.
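The abstract's "what if" questions can be read as general value functions that predict a cumulant under the counterfactual policy "keep executing the current action", at several temporal scales (discount factors). Below is a minimal sketch of that idea, not the authors' implementation: the feature size, discount values, the TD(0) update, and the way the fixed-action condition is handled (bootstrapping only on action-repeat transitions) are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup: each "what if" question predicts the discounted sum of a
# cumulant (e.g. an image-derived signal) under the persistent policy
# "stick to the current action". Multiple discounts = multiple timescales.
GAMMAS = [0.0, 0.5, 0.9]   # temporal scales (hypothetical values)
N_FEATURES = 8             # size of the encoded visual observation (assumed)

weights = np.zeros((len(GAMMAS), N_FEATURES))
alpha = 0.1                # TD learning rate (assumed)

def td0_update(phi, phi_next, cumulant, same_action):
    """One offline TD(0) step per temporal scale.

    The counterfactual condition "if the agent were to stick to its current
    action" is approximated off-policy here by bootstrapping only on logged
    transitions where the action was actually repeated.
    """
    for i, gamma in enumerate(GAMMAS):
        v = weights[i] @ phi
        v_next = weights[i] @ phi_next if same_action else 0.0
        td_error = cumulant + gamma * v_next - v
        weights[i] += alpha * td_error * phi

# Offline pass over one logged transition.
phi, phi_next = rng.normal(size=N_FEATURES), rng.normal(size=N_FEATURES)
td0_update(phi, phi_next, cumulant=1.0, same_action=True)

def policy_state(phi, force_feedback):
    """Online state: counterfactual predictions + in-situ observations."""
    predictions = weights @ phi  # one prediction per temporal scale
    return np.concatenate([predictions, force_feedback])

state = policy_state(phi, np.zeros(3))
print(state.shape)  # (len(GAMMAS) + 3,) -> (6,)
```

The point of the sketch is the split the paper argues for: the predictions are trained entirely offline from prior experience, while the downstream policy consumes them online alongside cheap in-situ signals such as force feedback.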
Nov-11-2020