Finding Safe Zones of policies Markov Decision Processes
Cohen, Lee, Mansour, Yishay, Moshkovitz, Michal
Given a policy of a Markov Decision Process, we define a SafeZone as a subset of states, such that most of the policy's trajectories are confined to this subset. The quality of a SafeZone is parameterized by the number of states and the escape probability, i.e., the probability that a random trajectory will leave the subset. SafeZones are especially interesting when they have a small number of states and low escape probability. We study the complexity of finding optimal SafeZones, and show that in general, the problem is computationally hard. Our main result is a bi-criteria approximation learning algorithm with a factor of almost $2$ approximation for both the escape probability and SafeZone size, using a polynomial size sample complexity.
Oct-9-2023
- Country:
- North America > United States
- Massachusetts > Middlesex County
- Cambridge (0.04)
- Illinois > Cook County
- Chicago (0.04)
- Massachusetts > Middlesex County
- Europe > United Kingdom
- England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East
- Israel > Tel Aviv District > Tel Aviv (0.04)
- North America > United States
- Genre:
- Research Report (0.50)
- Industry:
- Transportation (0.93)
- Government > Regional Government (0.46)