Recurrent World Models Facilitate Policy Evolution
Ha, David, Schmidhuber, Jürgen
Neural Information Processing Systems
A generative recurrent neural network is quickly trained in an unsupervised manner to model popular reinforcement learning environments through compressed spatio-temporal representations. The world model's extracted features are fed into compact and simple policies trained by evolution, achieving state-of-the-art results in various environments. We also train our agent entirely inside an environment generated by its own internal world model, and transfer this policy back into the actual environment. An interactive version of this paper is available at https://worldmodels.github.io
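The idea of feeding extracted features into a compact policy trained by evolution can be illustrated with a toy sketch. The abstract does not specify the policy form or the evolutionary algorithm, so everything below is an assumption made for illustration: a clipped linear policy, a simple elite-averaging evolution strategy, and a hypothetical stand-in objective (`episode_return`) in place of a real environment or learned world model. It is not the authors' implementation.

```python
import random


def linear_policy(params, features):
    """Map a feature vector to one action in [-1, 1] using a clipped linear layer."""
    *weights, bias = params
    raw = sum(w * f for w, f in zip(weights, features)) + bias
    return max(-1.0, min(1.0, raw))


def episode_return(params, rng, steps=20):
    """Hypothetical stand-in for a rollout: reward is higher when the action
    tracks a fixed linear function of the (random) features."""
    total = 0.0
    for _ in range(steps):
        features = [rng.uniform(-1.0, 1.0) for _ in range(len(params) - 1)]
        target = 0.5 * features[0] - 0.25 * features[1]
        action = linear_policy(params, features)
        total -= (action - target) ** 2
    return total


def evolve(num_features=2, population=32, elites=8, generations=40,
           sigma=0.3, seed=0):
    """Elite-averaging evolution strategy over the policy parameters."""
    rng = random.Random(seed)
    dim = num_features + 1  # weights plus one bias term
    mean = [0.0] * dim
    for _ in range(generations):
        # Sample candidate parameter vectors around the current mean.
        candidates = [[m + sigma * rng.gauss(0.0, 1.0) for m in mean]
                      for _ in range(population)]
        # Score every candidate on the same episodes within a generation.
        eval_seed = rng.randrange(2 ** 32)
        ranked = sorted(
            candidates,
            key=lambda p: episode_return(p, random.Random(eval_seed)),
            reverse=True,
        )
        best = ranked[:elites]
        # Move the mean to the average of the best candidates, then narrow the search.
        mean = [sum(p[i] for p in best) / elites for i in range(dim)]
        sigma *= 0.95
    return mean


if __name__ == "__main__":
    trained = evolve()
    print("evolved parameters:", trained)
    print("return:", episode_return(trained, random.Random(123)))
```

Because the policy has only a handful of parameters, a gradient-free search of this kind needs nothing more than episode returns, which is one reason small policies pair naturally with evolutionary training.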
Dec-31-2018
- Country:
- Asia
- Europe
- Finland > Uusimaa
- Helsinki (0.04)
- Germany > North Rhine-Westphalia
- Upper Bavaria > Munich (0.04)
- Greece (0.04)
- United Kingdom > England
- Oxfordshire > Oxford (0.04)
- North America
- Canada > Quebec
- Montreal (0.04)
- United States
- Massachusetts > Middlesex County
- Cambridge (0.04)
- New York > New York County
- New York City (0.04)
- Washington > King County
- Seattle (0.04)
- Industry:
- Health & Medicine > Therapeutic Area
- Neurology (0.68)
- Leisure & Entertainment
- Games > Computer Games (1.00)
- Sports (0.67)
- Technology: