POMO: Policy Optimization with Multiple Optima for Reinforcement Learning

Dec-24-2025, 21:14:18 GMT–Neural Information Processing Systems

In neural combinatorial optimization (CO), reinforcement learning (RL) can turn a deep neural net into a fast, powerful heuristic solver of NP-hard problems. This approach has great potential in practical applications because it allows near-optimal solutions to be found without relying on experts armed with substantial domain knowledge. We introduce Policy Optimization with Multiple Optima (POMO), an end-to-end approach for building such a heuristic solver. POMO is applicable to a wide range of CO problems and is designed to exploit the symmetries in the representation of a CO solution.
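
To make the idea of symmetry in a solution's representation concrete, here is a minimal Python sketch. It assumes the traveling salesman problem as the example, which this abstract does not name, and the helper names `tour_length` and `rotations` are hypothetical. It shows only that one cyclic tour can be written as several node sequences of equal cost; it does not reproduce POMO's training procedure.

```python
import math

def tour_length(coords, tour):
    """Total Euclidean length of a closed tour visiting nodes in the given order."""
    n = len(tour)
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % n]])
        for i in range(n)
    )

def rotations(tour):
    """All cyclic rotations of a tour: the same cycle, written from each start node."""
    return [tour[k:] + tour[:k] for k in range(len(tour))]

# Illustrative instance (an assumption): four cities on the corners of a unit square.
coords = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
tour = [0, 1, 2, 3]

# Every rotation is a different sequence but describes the same solution.
lengths = [tour_length(coords, r) for r in rotations(tour)]
for r, length in zip(rotations(tour), lengths):
    print(r, length)
```

Each of the four sequences describes the same cycle, so a sequence-generating policy has several equally good targets for a single solution. This is the kind of representational symmetry the abstract refers to.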

multiple optima, policy optimization, pomo, (6 more...)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.59)