Finding Safe Zones of policies Markov Decision Processes

Cohen, Lee, Mansour, Yishay, Moshkovitz, Michal

Oct-9-2023–arXiv.org Machine Learning

Given a policy of a Markov Decision Process, we define a SafeZone as a subset of states, such that most of the policy's trajectories are confined to this subset. The quality of a SafeZone is parameterized by the number of states and the escape probability, i.e., the probability that a random trajectory will leave the subset. SafeZones are especially interesting when they have a small number of states and low escape probability. We study the complexity of finding optimal SafeZones, and show that in general, the problem is computationally hard. Our main result is a bi-criteria approximation learning algorithm with a factor of almost $2$ approximation for both the escape probability and SafeZone size, using a polynomial size sample complexity.

artificial intelligence, machine learning, probability, (17 more...)

arXiv.org Machine Learning

Oct-9-2023

arXiv.org PDF

Add feedback

Country:
- North America > United States
  - Massachusetts > Middlesex County
    - Cambridge (0.04)
  - Illinois > Cook County
    - Chicago (0.04)
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East
  - Israel > Tel Aviv District > Tel Aviv (0.04)

Genre:
- Research Report (0.50)

Industry:
- Transportation (0.93)
- Government > Regional Government (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Agents (0.92)
  - Machine Learning > Learning Graphical Models
    - Undirected Networks > Markov Models (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found