Control Synthesis from Linear Temporal Logic Specifications using Model-Free Reinforcement Learning

Bozkurt, Alper Kamil, Wang, Yu, Zavlanos, Michael M., Pajic, Miroslav

Sep-16-2019–arXiv.org Artificial Intelligence

Arrows: actions top, left, down, and right; encircled characters: state labels. The actions in states that are not reachable or that lead to another LDBA state are not displayed. In all subfigures, the most likely paths are highlighted in red.

In the state of the baby b, the only allowed action is left, and when it is taken one of the following can happen: (i) the robot hits the wall with probability 0.1 and wakes the baby up; (ii) the robot moves left with probability 0.8 or moves down with probability 0.1. If the baby has been woken up, which means the robot could not leave in a single time step (represented in LTL as $b \wedge \bigcirc b$), the robot should notify the adult (at state a); otherwise, the robot should directly go back to the charger (at state c). The full objective is specified in LTL as

\[
\varphi_2 = \square \Big(
  \underbrace{\neg d}_{(1)}
  \wedge \underbrace{(b \wedge \neg \bigcirc b) \rightarrow \bigcirc\big(\neg b \,\mathsf{U}\, (a \vee c)\big)}_{(2)}
  \wedge \underbrace{a \rightarrow \bigcirc(\neg a \,\mathsf{U}\, b)}_{(3)}
  \wedge \underbrace{(\neg b \wedge \bigcirc b \wedge \neg \bigcirc\bigcirc b) \rightarrow (\neg a \,\mathsf{U}\, c)}_{(4)}
  \wedge \underbrace{c \rightarrow (\neg a \,\mathsf{U}\, b)}_{(5)}
  \wedge \underbrace{(b \wedge \bigcirc b) \rightarrow \lozenge a}_{(6)}
\Big).
\]
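The stochastic outcome of the action left at the baby's state, and the "baby woken up" condition (the robot is at b for two consecutive time steps), can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the outcome names, the helper functions `sample_outcome` and `baby_woken`, and the representation of a run as a list with one label per time step are all hypothetical. Only the probabilities 0.1, 0.8, and 0.1 come from the text.

```python
import random

# Outcome probabilities for taking action "left" in the baby's state b,
# as stated in the text. The outcome names are hypothetical labels.
LEFT_FROM_B = {"hit_wall": 0.1, "move_left": 0.8, "move_down": 0.1}


def sample_outcome(rng):
    """Draw one outcome of the action left taken at state b."""
    outcomes = list(LEFT_FROM_B)
    weights = [LEFT_FROM_B[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


def baby_woken(labels):
    """Return True if the run has label b at two consecutive steps.

    `labels` is assumed to hold one state label per time step
    (for example "a", "b", "c", "d", or "" for an unlabeled state).
    Two consecutive b's mean the robot did not leave in one time step.
    """
    return any(x == "b" and y == "b" for x, y in zip(labels, labels[1:]))


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_outcome(rng))
    print(baby_woken(["c", "b", "b", "a"]))  # robot stayed at b for two steps
    print(baby_woken(["c", "b", "", "c"]))   # robot left b immediately
```

Checking a finite run this way only detects the wake-up event; deciding whether an infinite run satisfies the full specification requires an automaton such as the LDBA mentioned in the caption.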

artificial intelligence, machine learning, reinforcement learning

Country:
- North America > United States (0.29)
- Asia > Middle East (0.28)

Genre:
- Research Report (0.40)

Technology:
- Information Technology > Artificial Intelligence
  - Robots (1.00)
  - Machine Learning
    - Reinforcement Learning (1.00)
    - Learning Graphical Models > Undirected Networks
      - Markov Models (0.47)
