Sample-Efficient Deep RL with Generative Adversarial Tree Search

Kamyar Azizzadenesheli, Brandon Yang, Weitang Liu, Emma Brunskill, Zachary C. Lipton, Animashree Anandkumar

arXiv.org (Artificial Intelligence)

We propose Generative Adversarial Tree Search (GATS), a sample-efficient Deep Reinforcement Learning (DRL) algorithm. While Monte Carlo Tree Search (MCTS) is known to be effective for search and planning in RL, it is often sample-inefficient and therefore expensive to apply in practice. In this work, we develop a Generative Adversarial Network (GAN) architecture to model an environment's dynamics and a predictor model for the reward function. We exploit data collected from interaction with the environment to learn these models, which we then use for model-based planning. During planning, we deploy a finite-depth MCTS that uses the learned models for tree expansion and a learned Q-function to evaluate leaf nodes, and select the best root action. We theoretically show that GATS improves the bias-variance trade-off in value-based DRL. Moreover, we show that the generative model learns the environment dynamics using orders of magnitude fewer samples than the Q-learner. In non-stationary settings where the environment model changes, we find the generative model adapts significantly faster than the Q-learner to the new environment.
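To make the planning step concrete, below is a minimal sketch of depth-limited lookahead with learned models and Q-value bootstrapping at the leaves. It is not the paper's implementation: GATS runs MCTS over the learned models, whereas this sketch substitutes an exhaustive depth-limited expansion for the MCTS machinery to keep the example short. All names (`dynamics_model`, `reward_model`, `q_function`) are hypothetical placeholders for the learned GAN dynamics model, reward predictor, and Q-network described in the abstract.

```python
from typing import Any, Callable, List


def plan_action(
    state: Any,
    actions: List[int],
    dynamics_model: Callable[[Any, int], Any],  # learned generator: (s, a) -> s'
    reward_model: Callable[[Any, int], float],  # learned predictor: (s, a) -> r
    q_function: Callable[[Any], List[float]],   # learned Q-values, one per action
    depth: int,
    gamma: float = 0.99,
) -> int:
    """Return the root action maximizing the depth-limited return estimate."""

    def value(s: Any, d: int) -> float:
        # At the search horizon, bootstrap with the learned Q-function,
        # exactly where GATS evaluates leaf nodes.
        if d == 0:
            return max(q_function(s))
        # Otherwise expand every action using the learned models.
        return max(
            reward_model(s, a) + gamma * value(dynamics_model(s, a), d - 1)
            for a in actions
        )

    # Score each root action and return the argmax.
    scores = [
        reward_model(state, a) + gamma * value(dynamics_model(state, a), depth - 1)
        for a in actions
    ]
    return max(range(len(actions)), key=lambda i: scores[i])
```

Because this variant expands every action at every node, its cost grows as |A|^depth; MCTS avoids that blow-up by sampling rollouts and concentrating expansion on promising branches, which is why the paper uses it for deeper searches.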
