Action Guidance with MCTS for Deep Reinforcement Learning
Bilal Kartal, Pablo Hernandez-Leal, Matthew E. Taylor
Deep reinforcement learning has achieved great success in recent years; however, one main challenge is sample inefficiency. In this paper, we focus on how to use action guidance by means of a non-expert demonstrator to improve sample efficiency in a domain with sparse, delayed, and possibly deceptive rewards: the recently proposed multi-agent benchmark of Pommerman. We propose a new framework in which even a non-expert simulated demonstrator, e.g., a planning algorithm such as Monte Carlo tree search with a small number of rollouts, can be integrated within asynchronous distributed deep reinforcement learning methods. Compared to a vanilla deep RL algorithm, our proposed methods both learn faster and converge to better policies on a two-player mini version of the Pommerman game.

Introduction

Deep reinforcement learning (DRL) has enabled better scalability and generalization for challenging domains (Arulkumaran et al. 2017; Li 2017; Hernandez-Leal, Kartal, and Taylor 2018) such as Atari games (Mnih et al. 2015), Go (Silver et al. 2016), and multiagent games, e.g., StarCraft II and Dota 2 (OpenAI 2018). However, one of the biggest current challenges for DRL is sample efficiency (Yu 2018). On the one hand, once a DRL agent is trained, it can act in real time by performing only an inference pass through the trained model. On the other hand, planning methods such as Monte Carlo tree search (MCTS) (Browne et al. 2012) have no training phase, but they perform computationally costly simulation-based rollouts (assuming access to a simulator) to find the best action to take. There are several ways to get the best of both DRL and search methods.
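To make the idea of a cheap, non-expert MCTS demonstrator concrete, the sketch below runs a handful of shallow random rollouts and returns the most-visited root action. This is an illustrative sketch, not the paper's implementation: the `env` simulator interface (`clone()`, `legal_actions()`, and `step()` returning a reward and a done flag) and all parameter defaults are assumptions introduced here for illustration.

```python
# Minimal sketch of a shallow MCTS used as a non-expert demonstrator.
# The `env` interface (clone, legal_actions, step) is hypothetical and
# not taken from the paper.
import math
import random

def mcts_action(env, n_rollouts=50, horizon=20, c=1.4):
    """Suggest an action via UCB1 over a few shallow random rollouts.

    With a small rollout budget the suggestion is noisy -- a non-expert
    demonstrator -- but cheap enough to query inside an RL worker.
    """
    actions = env.legal_actions()
    counts = {a: 0 for a in actions}
    totals = {a: 0.0 for a in actions}
    for t in range(1, n_rollouts + 1):
        # Try each root action once, then select by the UCB1 score.
        a = next((x for x in actions if counts[x] == 0),
                 max(actions,
                     key=lambda x: totals[x] / counts[x]
                     + c * math.sqrt(math.log(t) / counts[x])))
        # Evaluate it with one short random rollout on a cloned state.
        sim = env.clone()
        reward, done = sim.step(a)
        ret = reward
        for _ in range(horizon):
            if done:
                break
            reward, done = sim.step(random.choice(sim.legal_actions()))
            ret += reward
        counts[a] += 1
        totals[a] += ret
    # The most-visited root action is the demonstrator's suggestion.
    return max(actions, key=lambda a: counts[a])
```

In the asynchronous distributed setting the paper describes, such a suggestion could be consumed, for example, as an auxiliary supervised (cross-entropy) term added to each worker's actor-critic loss, so that the learner is guided by, rather than forced to copy, the demonstrator; the paper's exact integration may differ.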
Jul-25-2019