Memory Augmented Policy Optimization for Program Synthesis with Generalization