POMO: Policy Optimization with Multiple Optima for Reinforcement Learning

Neural Information Processing Systems 

We introduce Policy Optimization with Multiple Optima (POMO), an end-to-end approach for building such a heuristic solver. POMO is applicable to a wide range of CO problems.