Transformer-based World Models Are Happy With 100k Interactions
Robine, Jan, Höftmann, Marc, Uelwer, Tobias, Harmeling, Stefan
arXiv.org Artificial Intelligence
Deep neural networks have been successful in many reinforcement learning settings. However, compared to human learners, they are overly data-hungry. To build a sample-efficient world model, we apply a transformer to real-world episodes in an autoregressive manner: not only the compact latent states and the taken actions but also the experienced or predicted rewards are fed into the transformer, so that it can attend flexibly to all three modalities at different time steps. The transformer allows our world model to access previous states directly, instead of viewing them through a compressed recurrent state. By utilizing the Transformer-XL architecture, it is able to learn long-term dependencies while staying computationally efficient. Our transformer-based world model (TWM) generates meaningful, new experience, which is used to train a policy that outperforms previous model-free and model-based reinforcement learning algorithms on the Atari 100k benchmark.
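The idea of feeding latent states, actions, and rewards into one autoregressive transformer can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the token order (state, action, reward per time step), the shared embedding width `d`, the helper names, and the projection-free single-head attention are all assumptions made for illustration, and the Transformer-XL memory and relative positional encodings are omitted entirely.

```python
import numpy as np

def interleave_modalities(states, actions, rewards):
    """Interleave per-step embeddings as (z_1, a_1, r_1, z_2, a_2, r_2, ...).

    The ordering is an assumption for this sketch; each input has shape (T, d).
    """
    T, d = states.shape
    assert actions.shape == (T, d) and rewards.shape == (T, d)
    # Stack to (T, 3, d), then flatten the first two axes to (3T, d).
    return np.stack([states, actions, rewards], axis=1).reshape(3 * T, d)

def causal_mask(n):
    """Boolean mask: entry [i, j] is True when token i may attend to token j (j <= i)."""
    return np.tril(np.ones((n, n), dtype=bool))

def causal_self_attention(tokens):
    """Single-head attention without learned projections (illustration only)."""
    n, d = tokens.shape
    scores = tokens @ tokens.T / np.sqrt(d)
    # Block attention to future tokens before the softmax.
    scores = np.where(causal_mask(n), scores, -np.inf)
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ tokens, weights

# Hypothetical toy data: 4 time steps, 8-dimensional embeddings per modality.
rng = np.random.default_rng(0)
T, d = 4, 8
states = rng.normal(size=(T, d))
actions = rng.normal(size=(T, d))
rewards = rng.normal(size=(T, d))

tokens = interleave_modalities(states, actions, rewards)
out, weights = causal_self_attention(tokens)
```

Because every earlier token stays individually addressable in `weights`, a state token at step t can attend directly to any previous state, action, or reward, rather than seeing them only through a compressed recurrent state.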
Mar-13-2023
- Genre:
- Research Report (0.50)
- Industry:
- Leisure & Entertainment > Games (1.00)