Flipping-based Policy for Chance-Constrained Markov Decision Processes
Neural Information Processing Systems
Safe reinforcement learning (RL) is a promising approach for many real-world decision-making problems where ensuring safety is critical. In safe RL research, expected cumulative safety constraints (ECSCs) are typically the first choice, but chance constraints are often more pragmatic for incorporating safety under uncertainty. This paper proposes a flipping-based policy for Chance-Constrained Markov Decision Processes (CCMDPs). The flipping-based policy selects the next action by tossing a potentially distorted coin between two action candidates; both the flip probability and the two action candidates vary with the state.
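The coin-tossing mechanism described above can be sketched in a few lines. This is a minimal illustration, not the paper's actual construction: the candidate-action functions and the state-dependent flip probability below are hypothetical placeholders standing in for whatever a CCMDP solver would produce.

```python
import numpy as np

rng = np.random.default_rng(0)

def flipping_policy(state, candidate_a, candidate_b, flip_prob):
    """Pick an action by tossing a possibly biased coin between two
    state-dependent action candidates.

    candidate_a, candidate_b: functions state -> action (hypothetical)
    flip_prob: function state -> probability of choosing candidate_b
    """
    p = flip_prob(state)
    # Bernoulli(p) coin flip: heads picks candidate_b, tails candidate_a.
    if rng.random() < p:
        return candidate_b(state)
    return candidate_a(state)

# Toy illustration: a conservative and a reward-seeking candidate,
# with the flip biased more toward the greedy action near the origin.
a_safe = lambda s: 0.0
a_greedy = lambda s: 1.0
p_flip = lambda s: max(0.0, min(1.0, 1.0 - abs(s)))

action = flipping_policy(0.3, a_safe, a_greedy, p_flip)
```

Because the coin is re-tossed at every state, the resulting policy is stochastic over only two support points per state, which is what distinguishes it from a generic stochastic policy over the full action space.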
May-30-2025, 02:41:55 GMT