POMO: Policy Optimization with Multiple Optima for Reinforcement Learning
Neural Information Processing Systems
We introduce Policy Optimization with Multiple Optima (POMO), an end-to-end approach for building such a heuristic solver. POMO is applicable to a wide range of CO problems.
Feb-11-2026, 02:21:27 GMT