Value Prediction Network

Oh, Junhyuk, Singh, Satinder, Lee, Honglak

Dec-31-2017–Neural Information Processing Systems

This paper proposes a novel deep reinforcement learning (RL) architecture, called Value Prediction Network (VPN), which integrates model-free and model-based RL methods into a single neural network. In contrast to typical model-based RL methods, VPN learns a dynamics model whose abstract states are trained to make option-conditional predictions of future values (discounted sum of rewards) rather than of future observations. Our experimental results show that VPN has several advantages over both model-free and model-based baselines in a stochastic environment where careful planning is required but building an accurate observation-prediction model is difficult. Furthermore, VPN outperforms Deep Q-Network (DQN) on several Atari games even with short-lookahead planning, demonstrating its potential as a new way of learning a good state representation.
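
The abstract defines the "value" that VPN's abstract states are trained to predict as a discounted sum of rewards. As a minimal illustration of that quantity only (not of the VPN architecture or its planning procedure), the Python sketch below computes it for a finite reward sequence. The function name `discounted_return` and the default discount factor `gamma=0.99` are assumptions made for this example and do not come from the paper.

```python
def discounted_return(rewards, gamma=0.99):
    """Return sum over t of gamma**t * rewards[t] for a finite reward list.

    `gamma` is a hypothetical default chosen for illustration.
    """
    total = 0.0
    # Work backwards so each step applies G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


if __name__ == "__main__":
    # Three steps of reward 1.0 with gamma = 0.5: 1 + 0.5 + 0.25
    print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))
```

A smaller `gamma` weights near-term rewards more heavily, which is one reason predictions of this quantity can remain useful even with short lookahead.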

computer game, deep learning, vpn
