Reinforcement Learning with Parameterized Actions

Warwick Masson, Pravesh Ranchod, George Konidaris

Nov-26-2015, arXiv.org Artificial Intelligence

We introduce a model-free algorithm for learning in Markov decision processes with parameterized actions: discrete actions with continuous parameters. At each step the agent must select both which action to use and which parameters to use with that action. We introduce the Q-PAMDP algorithm for learning in these domains, show that it converges to a local optimum, and compare it to direct policy search in the goal-scoring and Platform domains.
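The two-part choice described above (which action, then which parameters) can be illustrated with a minimal Python sketch. This is not the Q-PAMDP algorithm itself; it only shows what a parameterized action might look like in code. The names `ParameterizedAction`, `select_action`, `q_values`, and `param_policies`, the toy actions "run" and "jump", and the use of epsilon-greedy selection with Gaussian parameter sampling are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

State = Sequence[float]


@dataclass(frozen=True)
class ParameterizedAction:
    name: str                  # which discrete action was chosen
    params: Tuple[float, ...]  # continuous parameters for that action


def select_action(
    state: State,
    q_values: Dict[str, Callable[[State], float]],
    param_policies: Dict[str, Callable[[State, random.Random], List[float]]],
    epsilon: float,
    rng: random.Random,
) -> ParameterizedAction:
    """Choose a discrete action (epsilon-greedy), then sample its parameters."""
    names = sorted(q_values)
    if rng.random() < epsilon:
        name = rng.choice(names)  # explore: uniform over discrete actions
    else:
        name = max(names, key=lambda n: q_values[n](state))  # exploit
    # Each discrete action has its own parameter policy and parameter count.
    return ParameterizedAction(name, tuple(param_policies[name](state, rng)))


# Hypothetical toy setup: two actions with different numbers of parameters.
q_values = {
    "run": lambda s: 1.0 - s[0],
    "jump": lambda s: s[0],
}
param_policies = {
    "run": lambda s, rng: [rng.gauss(0.5, 0.1)],
    "jump": lambda s, rng: [rng.gauss(1.0, 0.1), rng.gauss(0.0, 0.1)],
}

action = select_action([0.9], q_values, param_policies,
                       epsilon=0.0, rng=random.Random(0))
print(action.name, len(action.params))
```

The sketch keeps the discrete choice and the parameter choice as separate functions of the state, which is one simple way to represent a policy over parameterized actions; how those two parts are actually learned is what the paper addresses.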
