Program Synthesis Through Reinforcement Learning Guided Tree Search
Simmons-Edler, Riley, Miltner, Anders, Seung, Sebastian
–arXiv.org Artificial Intelligence
Program Synthesis is the task of generating a program from a provided specification. Traditionally, this has been treated as a search problem by the programming languages (PL) community and more recently as a supervised learning problem by the machine learning community. Here, we propose a third approach, representing the task of synthesizing a given program as a Markov decision process solvable via reinforcement learning(RL). From observations about the states of partial programs, we attempt to find a program that is optimal over a provided reward metric on pairs of programs and states. We instantiate this approach on a subset of the RISC-V assembly language operating on floating point numbers, and as an optimization inspired by search-based techniques from the PL community, we combine RL with a priority search tree. We evaluate this instantiation and demonstrate the effectiveness of our combined method compared to a variety of baselines, including a pure RL ablation and a state of the art Markov chain Monte Carlo search method on this task.
arXiv.org Artificial Intelligence
Jun-7-2018
- Country:
- North America > United States
- California > Alameda County
- Berkeley (0.04)
- Massachusetts > Middlesex County
- Cambridge (0.04)
- New Jersey > Mercer County
- Princeton (0.04)
- New York > New York County
- New York City (0.04)
- Pennsylvania > Allegheny County
- Pittsburgh (0.04)
- California > Alameda County
- North America > United States
- Genre:
- Research Report (0.64)
- Technology: