Imitation-Projected Programmatic Reinforcement Learning

Abhinav Verma, Hoang Le, Yisong Yue, Swarat Chaudhuri

Neural Information Processing Systems 

We study the problem of programmatic reinforcement learning, in which policies are represented as short programs in a symbolic language. Programmatic policies can be more interpretable, generalizable, and amenable to formal verification than neural policies; however, designing rigorous learning approaches for such policies remains a challenge.