Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model
Schrittwieser, Julian, Antonoglou, Ioannis, Hubert, Thomas, Simonyan, Karen, Sifre, Laurent, Schmitt, Simon, Guez, Arthur, Lockhart, Edward, Hassabis, Demis, Graepel, Thore, Lillicrap, Timothy, Silver, David
Planning algorithms based on lookahead search have achieved remarkable successes in artificial intelligence. Human world champions have been defeated in classic games such as checkers [34], chess [5], Go [38] and poker [3, 26], and planning algorithms have had real-world impact in applications from logistics [47] to chemical synthesis [37]. However, these planning algorithms all rely on knowledge of the environment's dynamics, such as the rules of the game or an accurate simulator, preventing their direct application to real-world domains like robotics, industrial control, or intelligent assistants. Model-based reinforcement learning (RL) [42] aims to address this issue by first learning a model of the environment's dynamics, and then planning with respect to the learned model. Typically, these models have either focused on reconstructing the true environmental state [8, 16, 24], or the sequence of full observations [14, 20]. However, prior work [4, 14, 20] remains far from the state of the art in visually rich domains, such as Atari 2600 games [2]. Instead, the most successful methods are based on model-free RL [9, 21, 18] - i.e. they estimate the optimal policy and/or value function directly from interactions with the environment. However, model-free algorithms are in turn far from the state of the art in domains that require precise and sophisticated lookahead, such as chess and Go. In this paper, we introduce MuZero, a new approach to model-based RL that achieves state-of-the-art performance in Atari 2600, a visually complex set of domains, while maintaining superhuman performance in precision planning tasks such as chess, shogi and Go.
Nov-19-2019
- Country:
- North America
- Puerto Rico (0.04)
- United States
- Massachusetts > Middlesex County
- Cambridge (0.04)
- California > Los Angeles County
- Long Beach (0.04)
- Massachusetts > Middlesex County
- Europe
- United Kingdom (0.04)
- Greece (0.04)
- North America
- Genre:
- Research Report (1.00)
- Workflow (0.93)
- Industry:
- Leisure & Entertainment > Games
- Chess (1.00)
- Computer Games (0.94)
- Leisure & Entertainment > Games
- Technology: