Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model

Schrittwieser, Julian, Antonoglou, Ioannis, Hubert, Thomas, Simonyan, Karen, Sifre, Laurent, Schmitt, Simon, Guez, Arthur, Lockhart, Edward, Hassabis, Demis, Graepel, Thore, Lillicrap, Timothy, Silver, David

Nov-19-2019 – arXiv.org Machine Learning

Planning algorithms based on lookahead search have achieved remarkable successes in artificial intelligence. Human world champions have been defeated in classic games such as checkers [34], chess [5], Go [38] and poker [3, 26], and planning algorithms have had real-world impact in applications from logistics [47] to chemical synthesis [37]. However, these planning algorithms all rely on knowledge of the environment's dynamics, such as the rules of the game or an accurate simulator, preventing their direct application to real-world domains like robotics, industrial control, or intelligent assistants. Model-based reinforcement learning (RL) [42] aims to address this issue by first learning a model of the environment's dynamics, and then planning with respect to the learned model. Typically, these models have focused either on reconstructing the true environmental state [8, 16, 24] or on the sequence of full observations [14, 20]. Prior work [4, 14, 20], however, remains far from the state of the art in visually rich domains, such as Atari 2600 games [2]. Instead, the most successful methods are based on model-free RL [9, 21, 18]; that is, they estimate the optimal policy and/or value function directly from interactions with the environment. Model-free algorithms are in turn far from the state of the art in domains that require precise and sophisticated lookahead, such as chess and Go. In this paper, we introduce MuZero, a new approach to model-based RL that achieves state-of-the-art performance in Atari 2600, a visually complex set of domains, while maintaining superhuman performance in precision planning tasks such as chess, shogi and Go.
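
The two-step recipe of model-based RL described above (learn a model of the dynamics, then plan with respect to it) can be illustrated with a minimal sketch. This is not MuZero and not code from the paper: the five-state chain environment, the tabular model, and the names `true_step`, `learn_model` and `plan` are all hypothetical, chosen only to keep the example small. The planner is a plain exhaustive lookahead and never calls the true dynamics.

```python
import random

# Hypothetical toy environment (an assumption, not from the paper):
# a 5-state chain. Action 1 moves right, action 0 moves left;
# arriving at (or staying in) state 4 yields reward 1.
N_STATES, ACTIONS, GOAL = 5, (0, 1), 4


def true_step(state, action):
    """Ground-truth dynamics; used only to generate experience."""
    nxt = min(state + 1, GOAL) if action == 1 else max(state - 1, 0)
    return nxt, (1.0 if nxt == GOAL else 0.0)


def learn_model(num_samples=500, seed=0):
    """Step 1: fit a tabular model from randomly sampled transitions."""
    rng = random.Random(seed)
    model = {}
    for _ in range(num_samples):
        s, a = rng.randrange(N_STATES), rng.choice(ACTIONS)
        # The toy dynamics are deterministic, so one sample per pair suffices.
        model[(s, a)] = true_step(s, a)
    return model


def plan(model, state, depth):
    """Step 2: exhaustive lookahead that queries only the learned model.

    Returns (best total reward over `depth` steps, best first action).
    """
    if depth == 0:
        return 0.0, None
    best_value, best_action = float("-inf"), None
    for a in ACTIONS:
        if (state, a) not in model:
            continue  # transition never observed, so it cannot be planned through
        nxt, reward = model[(state, a)]
        value = reward + plan(model, nxt, depth - 1)[0]
        if value > best_value:
            best_value, best_action = value, a
    if best_action is None:
        return 0.0, None
    return best_value, best_action


model = learn_model()
value, action = plan(model, state=0, depth=4)
```

From state 0 with a four-step horizon, the planner selects action 1 (move right), since that is the only way to reach the rewarding state within the horizon. A model-free method would instead estimate a value function or policy directly from the sampled transitions, without the intermediate `model` table.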

algorithm, muzero, simulation

Country:
- North America
  - Puerto Rico (0.04)
  - United States
    - Massachusetts > Middlesex County
      - Cambridge (0.04)
    - California > Los Angeles County
      - Long Beach (0.04)
- Europe
  - United Kingdom (0.04)
  - Greece (0.04)

Genre:
- Research Report (1.00)
- Workflow (0.93)

Industry:
- Leisure & Entertainment > Games
  - Chess (1.00)
  - Computer Games (0.94)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Planning & Scheduling (1.00)
  - Machine Learning
    - Reinforcement Learning (1.00)
    - Neural Networks > Deep Learning (0.68)
