Constrained Reinforcement Learning Has Zero Duality Gap
Paternain, Santiago; Chamon, Luiz; Calvo-Fullana, Miguel; Ribeiro, Alejandro
Neural Information Processing Systems
Autonomous agents must often deal with conflicting requirements, such as completing tasks using the least amount of time/energy, learning multiple tasks, or dealing with multiple opponents. In the context of reinforcement learning (RL), these problems are addressed by (i) designing a reward function that simultaneously describes all requirements or (ii) combining modular value functions that encode them individually. Though effective, these methods have critical downsides. Designing good reward functions that balance different objectives is challenging, especially as the number of objectives grows. Moreover, implicit interference between goals may lead to performance plateaus as they compete for resources, particularly when training on-policy.
Mar-18-2020, 23:33:20 GMT