Projected Natural Actor-Critic

Thomas, Philip S., Dabney, William C., Giguere, Stephen, Mahadevan, Sridhar

Dec-31-2013–Neural Information Processing Systems

Natural actor-critics are a popular class of policy search algorithms for finding locally optimal policies for Markov decision processes. In this paper we address a drawback of natural actor-critics that limits their real-world applicability: their lack of safety guarantees. We present a principled algorithm for performing natural gradient descent over a constrained domain. In the context of reinforcement learning, this allows for natural actor-critic algorithms that are guaranteed to remain within a known safe region of policy space. While deriving our class of constrained natural actor-critic algorithms, which we call Projected Natural Actor-Critics (PNACs), we also elucidate the relationship between natural gradient descent and mirror descent.
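The abstract does not spell out the update rule, so the following is only a toy sketch of what natural gradient descent over a constrained domain can look like, not the PNAC algorithm itself. It assumes a fixed 2x2 metric G, a quadratic objective, a single half-space standing in for the "safe region", and a projection taken in the norm induced by G; all names (solve2, project_halfspace, projected_natural_gradient_descent, TARGET, A, B) are hypothetical.

```python
# Toy sketch: projected natural gradient descent in 2-D, pure Python.
# Assumptions (not taken from the abstract): a fixed metric G, a quadratic
# objective, and a half-space {x : a.x <= b} playing the role of the safe set.

def solve2(G, v):
    """Solve the 2x2 linear system G y = v by Cramer's rule."""
    (g11, g12), (g21, g22) = G
    det = g11 * g22 - g12 * g21
    return [(g22 * v[0] - g12 * v[1]) / det,
            (g11 * v[1] - g21 * v[0]) / det]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def project_halfspace(x, G, a, b):
    """Project x onto {y : a.y <= b} in the norm induced by G.

    Closed form: y = x - G^{-1} a * (a.x - b) / (a.G^{-1} a) when violated.
    """
    violation = dot(a, x) - b
    if violation <= 0.0:
        return list(x)  # already inside the safe set
    g_inv_a = solve2(G, a)
    scale = violation / dot(a, g_inv_a)
    return [x[0] - scale * g_inv_a[0], x[1] - scale * g_inv_a[1]]

def projected_natural_gradient_descent(grad, x0, G, a, b, step=0.1, iters=500):
    """Alternate a natural gradient step with a G-norm projection."""
    x = list(x0)
    for _ in range(iters):
        nat = solve2(G, grad(x))  # natural gradient direction G^{-1} grad f(x)
        x = [x[0] - step * nat[0], x[1] - step * nat[1]]
        x = project_halfspace(x, G, a, b)  # return to the safe set
    return x

# Hypothetical problem: minimise 0.5 * ||x - TARGET||^2 subject to x[0] <= 1.
TARGET = [2.0, 1.0]

def grad(x):
    return [x[0] - TARGET[0], x[1] - TARGET[1]]

G = [[2.0, 0.5], [0.5, 1.0]]  # symmetric positive definite metric
A, B = [1.0, 0.0], 1.0        # safe region: x[0] <= 1

x_final = projected_natural_gradient_descent(grad, [0.0, 0.0], G, A, B)
```

In this toy setting the unconstrained minimiser (2, 1) lies outside the safe set, and the iterates settle near the constrained minimiser (1, 1) while every projected iterate satisfies the constraint up to floating-point error. In a reinforcement learning setting the metric would typically depend on the current policy parameters rather than being fixed.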

algorithm, artificial intelligence, health & medicine

Country:
- North America > United States > Massachusetts > Hampshire County > Amherst (0.14)

Industry:
- Health & Medicine > Therapeutic Area (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning
    - Learning Graphical Models > Undirected Networks
      - Markov Models (0.34)
    - Reinforcement Learning (0.68)
    - Statistical Learning > Gradient Descent (0.57)
  - Representation & Reasoning (1.00)
