DeepSynth: Program Synthesis for Automatic Task Segmentation in Deep Reinforcement Learning
Mohammadhosein Hasanbeig, Natasha Yogananda Jeppu, Alessandro Abate, Tom Melham, Daniel Kroening
arXiv.org Artificial Intelligence
We propose a method for efficient training of deep Reinforcement Learning (RL) agents when the reward is highly sparse and non-Markovian, but at the same time admits a high-level yet unknown sequential structure, as seen in a number of video games. This high-level sequential structure can be expressed as a computer program, which our method infers automatically as the RL agent explores the environment. Through this process, a high-level sequential task that occurs only rarely may nonetheless be encoded within the inferred program. A hybrid architecture for deep neural fitted Q-iteration is then employed to fill in low-level details and generate an optimal control policy that follows the structure of the program. Our experiments show that the agent is able to synthesise a complex program to guide the RL exploitation phase, which is otherwise difficult to achieve with state-of-the-art RL techniques.
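To illustrate why a sequential task structure helps with a non-Markovian reward, here is a minimal sketch. It is not the authors' implementation: the automaton, the event labels `key` and `door`, and the helper names are all hypothetical, and the automaton is written by hand rather than inferred from exploration traces. The idea shown is only that once task progress is tracked by a small finite automaton, the reward depends on the current automaton state and the current event rather than on the whole history.

```python
from dataclasses import dataclass


@dataclass
class TaskAutomaton:
    """Hypothetical deterministic automaton over high-level event labels."""
    transitions: dict                 # (state, label) -> next state
    initial: int = 0
    accepting: frozenset = frozenset()

    def step(self, state, label):
        # Labels with no listed transition leave the automaton state unchanged.
        return self.transitions.get((state, label), state)


def run_episode(automaton, labels):
    """Return the automaton states visited and the reward issued at each step."""
    q = automaton.initial
    states, rewards = [q], []
    for label in labels:
        q_next = automaton.step(q, label)
        # Sparse reward: paid only on first entry into an accepting state.
        entered = q_next in automaton.accepting and q not in automaton.accepting
        rewards.append(1.0 if entered else 0.0)
        q = q_next
        states.append(q)
    return states, rewards


# Illustrative task (assumed, not from the paper): pick up a key, then reach a door.
automaton = TaskAutomaton(
    transitions={(0, "key"): 1, (1, "door"): 2},
    accepting=frozenset({2}),
)
states, rewards = run_episode(automaton, ["door", "none", "key", "none", "door"])
```

In this sketch, reaching the door before the key yields no reward, while reaching it afterwards does. An agent that observes the pair (environment state, automaton state) can therefore learn a separate behaviour for each stage of the task.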
Nov-22-2019
- Country:
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
- Genre:
- Research Report (0.40)
- Industry:
- Leisure & Entertainment > Games > Computer Games (1.00)
- Technology: