Finding Safe Zones of policies Markov Decision Processes
Cohen, Lee, Mansour, Yishay, Moshkovitz, Michal
Given a policy of a Markov Decision Process, we define a SafeZone as a subset of states, such that most of the policy's trajectories are confined to this subset. The quality of a SafeZone is parameterized by the number of states and the escape probability, i.e., the probability that a random trajectory will leave the subset. SafeZones are especially interesting when they have a small number of states and low escape probability. We study the complexity of finding optimal SafeZones, and show that in general, the problem is computationally hard. Our main result is a bi-criteria approximation learning algorithm with a factor of almost $2$ approximation for both the escape probability and SafeZone size, using a polynomial size sample complexity.
Oct-9-2023
- Country:
- North America > United States (0.67)
- Genre:
- Research Report (0.50)
- Industry:
- Government > Regional Government (0.46)
- Transportation (0.93)