Memory Augmented Policy Optimization for Program Synthesis and Semantic Parsing

Chen Liang, Mohammad Norouzi, Jonathan Berant, Quoc V. Le, Ni Lao

Neural Information Processing Systems 

MAPO improves the sample efficiency and robustness of policy gradient methods, especially on tasks with sparse rewards.
