Semi-Infinitely Constrained Markov Decision Processes
Yang Peng
Academy of Advanced Interdisciplinary Studies
School of Mathematical Sciences
Peking University
Neural Information Processing Systems
We propose a generalization of constrained Markov decision processes (CMDPs) that we call the semi-infinitely constrained Markov decision process (SICMDP). In particular, we consider a continuum of constraints rather than the finite number of constraints found in ordinary CMDPs. We also devise a reinforcement learning algorithm for SICMDPs that we call SI-CRL. We first transform the reinforcement learning problem into a linear semi-infinite programming (LSIP) problem and then solve it with the dual exchange method from the LSIP literature. To the best of our knowledge, we are the first to apply tools from semi-infinite programming (SIP) to constrained reinforcement learning problems.
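To illustrate the idea of replacing a continuum of constraints with a growing finite working set, here is a minimal sketch of a generic exchange (cutting-plane) scheme on a toy one-variable LSIP. It is not SI-CRL and not the dual exchange method itself. The function exchange_method, the constraint functions slope and rhs, and the grid standing in for the continuum are all hypothetical choices made for this illustration. The sketch assumes slope(y) > 0 and rhs(y) >= 0, so every constraint is an upper bound on x and x = 0 is feasible.

```python
from typing import Callable, List, Tuple


def exchange_method(
    slope: Callable[[float], float],
    rhs: Callable[[float], float],
    grid: List[float],
    tol: float = 1e-9,
    max_iter: int = 100,
) -> Tuple[float, List[float]]:
    """Maximize x in [0, 1] subject to slope(y) * x <= rhs(y) for all y in grid.

    Assumption (illustrative): slope(y) > 0 and rhs(y) >= 0 for every y.
    """
    active: List[float] = []  # finite working set of constraint indices y
    x = 1.0
    for _ in range(max_iter):
        # Step 1: solve the finite subproblem restricted to the working set.
        # With one variable, the LP solution is the tightest upper bound.
        x = min([1.0] + [rhs(y) / slope(y) for y in active])

        # Step 2: search the (discretized) continuum for the most violated constraint.
        worst = max(grid, key=lambda y: slope(y) * x - rhs(y))
        violation = slope(worst) * x - rhs(worst)

        # Step 3: stop if x satisfies every constraint, otherwise add the violator.
        if violation <= tol:
            break
        active.append(worst)
    return x, active


# Toy instance: constraints indexed by y in [0, 1], discretized on 101 points.
grid = [i / 100 for i in range(101)]
x, active = exchange_method(
    slope=lambda y: 1.0 + y,
    rhs=lambda y: 0.6 + (y - 0.5) ** 2,
    grid=grid,
)
print(f"x = {x:.6f}, constraints used: {len(active)} of {len(grid)}")
```

In this toy run only a handful of the constraints ever enter the working set, which is the reason exchange-type methods are attractive when the constraint index set is infinite.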
Mar-23-2025, 20:39:56 GMT
- Country:
- North America > United States (0.68)
- Genre:
- Research Report (0.68)
- Industry:
- Education > Focused Education
- Special Education (0.45)
- Energy (0.68)