inverse model
Learning to Poke by Poking: Experiential Learning of Intuitive Physics
We investigate an experiential learning paradigm for acquiring an internal model of intuitive physics. Our model is evaluated on a real-world robotic manipulation task that requires displacing objects to target locations by poking. The robot gathered over 400 hours of experience by executing more than 50K pokes on different objects. We propose a novel approach based on deep neural networks for modeling the dynamics of the robot's interactions directly from images, by jointly estimating forward and inverse models of dynamics. The inverse model objective provides supervision to construct informative visual features, which the forward model can then predict, in turn regularizing the feature space for the inverse model. The interplay between these two objectives creates useful, accurate models that can then be used for multi-step decision making. This formulation has the additional benefit that forward models can be learned in an abstract feature space, which alleviates the need to predict pixels. Our experiments show that this joint modeling approach outperforms alternative methods. We also demonstrate that active data collection using the learned model further improves performance.
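To make the joint objective described in the abstract concrete, here is a minimal sketch of how an inverse loss and a forward loss can share one feature encoder. It is not the paper's implementation: the linear layers, the squared-error losses, the continuous two-dimensional action, the weighting `lam`, and all names (`joint_loss`, `params`, `OBS_DIM`, and so on) are assumptions made for illustration only.

```python
import random

def linear(weights, vec):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]

def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for a learned network (assumption)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def squared_error(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target))

def joint_loss(params, obs, action, next_obs, lam=0.1):
    """Inverse loss plus lam times the forward loss, both on shared features."""
    feat = linear(params["encoder"], obs)            # features of the current observation
    next_feat = linear(params["encoder"], next_obs)  # features of the next observation
    # Inverse model: predict the action from the two feature vectors.
    pred_action = linear(params["inverse"], feat + next_feat)
    # Forward model: predict the next features from current features and action,
    # so no pixels need to be predicted.
    pred_next_feat = linear(params["forward"], feat + list(action))
    inv = squared_error(pred_action, action)
    fwd = squared_error(pred_next_feat, next_feat)
    return inv + lam * fwd, inv, fwd

rng = random.Random(0)
OBS_DIM, FEAT_DIM, ACT_DIM = 8, 4, 2  # hypothetical sizes
params = {
    "encoder": random_matrix(FEAT_DIM, OBS_DIM, rng),
    "inverse": random_matrix(ACT_DIM, 2 * FEAT_DIM, rng),
    "forward": random_matrix(FEAT_DIM, FEAT_DIM + ACT_DIM, rng),
}
obs = [rng.random() for _ in range(OBS_DIM)]
next_obs = [rng.random() for _ in range(OBS_DIM)]
action = [0.5, -0.25]
total, inv, fwd = joint_loss(params, obs, action, next_obs)
print(total, inv, fwd)
```

Because both losses pass through the same `encoder` weights, minimizing the sum would push the features to be informative about the action (inverse term) and predictable under the action (forward term), which is the interplay the abstract describes.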
- North America > Canada > Alberta (0.14)
- North America > Canada > Quebec > Montreal (0.14)
- North America > United States > Virginia (0.04)
- Information Technology > Artificial Intelligence > Robots (0.93)
- Information Technology > Artificial Intelligence > Vision (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > Canada (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada (0.04)
cf708fc1decf0337aded484f8f4519ae-Supplemental.pdf
We found that training an inverse model is crucial for learning good representations. On the first row, a level from each environment that one-shot PPGS fails to solve (the white arrows represent the policy). Iterative Model Improvement: In general settings, collecting training trajectories by sampling actions uniformly at random does not grant sufficient coverage of the state space. GLAMOR: GLAMOR [34] learns inverse dynamics to achieve visual goals in Atari games. The only difference with PPGS in terms of settings is that we allow GLAMOR to collect data on-policy and for more interactions (2M).
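The coverage remark above can be illustrated with a toy example. The sketch below is not one of the environments from the source: the one-dimensional chain, the sizes, and the function name `random_walk_coverage` are assumptions chosen only to show how uniformly random actions can leave most of a state space unvisited.

```python
import random

def random_walk_coverage(n_states, n_steps, seed=0):
    """Fraction of a 1-D chain visited when actions are sampled uniformly."""
    rng = random.Random(seed)
    state = 0
    visited = {state}
    for _ in range(n_steps):
        step = rng.choice((-1, 1))  # uniform choice between "left" and "right"
        state = min(max(state + step, 0), n_states - 1)  # stay inside the chain
        visited.add(state)
    return len(visited) / n_states

coverage = random_walk_coverage(n_states=1000, n_steps=200)
print(f"fraction of states visited: {coverage:.3f}")
```

Even in the best case the walk can touch at most 201 of the 1000 states here, and a random walk typically wanders far less than that, which is one way to see why a data collection scheme that goes beyond uniform random actions may be needed.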
- Europe > Austria (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- North America > United States > North Carolina > Durham County > Durham (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > Netherlands > South Holland > Dordrecht (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)