Distributed Deep Q-Learning
Ong, Hao Yi, Chavez, Kevin, Hong, Augustus
arXiv.org Artificial Intelligence
Reinforcement learning (RL) agents face a tremendous challenge in optimizing their control of a system approaching real-world complexity: they must derive efficient representations of the environment from high-dimensional sensory inputs and use these to generalize past experience to new situations. While past work in RL has shown that agents can learn good control policies when given good handcrafted features, its applicability has been limited to domains where such features have been discovered, or to domains with fully observed, low-dimensional state spaces [1]-[3]. We consider the problem of efficiently scaling a deep learning algorithm to control a complicated system with high-dimensional sensory inputs. The basis of our algorithm is an RL agent called a deep Q-network (DQN) [4], [5], which combines RL with a class of artificial neural networks known as deep neural networks [6]. DQN uses an architecture called the deep convolutional network, which uses hierarchical layers of tiled convolutional filters to exploit the local spatial correlations present in images. As a result, this architecture is robust to natural transformations such as changes of viewpoint and scale [7]. In practice, increasing the scale of deep learning with respect to the number of training examples or the number of model parameters can drastically improve the performance of deep neural networks [8], [9]. To train a deep network with many parameters efficiently on multiple machines, we adapt a software framework called DistBelief to the training of RL agents [10].
Oct-15-2015
- Genre:
- Research Report (0.64)
- Industry:
- Leisure & Entertainment > Games (1.00)
- Technology: