Learning Optimal Strategies for Temporal Tasks in Stochastic Games

Bozkurt, Alper Kamil, Wang, Yu, Pajic, Miroslav

Feb-8-2021–arXiv.org Artificial Intelligence

Linear temporal logic (LTL) is widely used to formally specify complex tasks for autonomy. Unlike typical tasks defined solely by reward functions, LTL tasks are noncumulative and require memory-dependent strategies. In this work, we introduce a method to learn optimal controller strategies that maximize the probability of satisfying the LTL specifications of the desired tasks in stochastic games, which are natural extensions of Markov decision processes (MDPs) to systems with adversarial inputs. Our approach constructs a product game from the deterministic automaton derived from the given LTL task, together with a reward machine based on the acceptance condition of the automaton, thus allowing the use of a model-free reinforcement learning (RL) algorithm to learn an optimal controller strategy. Since the rewards and the transition probabilities of the reward machine do not depend on the number of sets defining the acceptance condition, our approach is scalable to a wide range of LTL tasks, as we demonstrate on several case studies.
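To make the product construction concrete, here is a minimal sketch in Python. It is not the authors' implementation. The toy state space, the labels, the transition table, and the names `automaton_step` and `product_step` are all hypothetical. It also simplifies in three ways: the task is the reachability formula "eventually goal" rather than a general LTL formula, the reward is a plain 0/1 signal rather than the reward machine the abstract describes, and the adversary is not modeled (in a turn-based game, its moves would simply be the actions taken at the states it owns).

```python
import random

# Hypothetical toy model: three states; only state 2 carries the label "goal".
LABELS = {0: set(), 1: set(), 2: {"goal"}}

# Hypothetical transition table: (state, action) -> [(probability, next_state), ...].
TRANSITIONS = {
    (0, "a"): [(0.5, 0), (0.5, 1)],
    (0, "b"): [(1.0, 0)],
    (1, "a"): [(1.0, 2)],
    (1, "b"): [(1.0, 0)],
    (2, "a"): [(1.0, 2)],
    (2, "b"): [(1.0, 2)],
}


def automaton_step(q, label):
    """Deterministic automaton for 'eventually goal'.

    q0 moves to q1 when "goal" is observed; q1 is absorbing.
    """
    if q == "q1" or "goal" in label:
        return "q1"
    return "q0"


def product_step(state, q, action, rng):
    """Sample one transition of the product of the model and the automaton.

    The model moves first; the automaton then reads the label of the
    new state (one common convention, assumed here).
    """
    probs, succs = zip(*TRANSITIONS[(state, action)])
    next_state = rng.choices(succs, weights=probs, k=1)[0]
    next_q = automaton_step(q, LABELS[next_state])
    # Simplified reward: 1 the first time the automaton reaches q1, else 0.
    reward = 1.0 if (q != "q1" and next_q == "q1") else 0.0
    return next_state, next_q, reward
```

A model-free learner only ever calls something like `product_step` to sample transitions. It never inspects `TRANSITIONS` directly, and the automaton component `q` provides the memory that a strategy over model states alone would lack.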


