Projected Natural Actor-Critic

Mar-13-2024, 21:24:22 GMT–Neural Information Processing Systems

In this paper we address a drawback of natural actor-critics that limits their real-world applicability: their lack of safety guarantees. We present a principled algorithm for performing natural gradient descent over a constrained domain. In the context of reinforcement learning, this allows for natural actor-critic algorithms that are guaranteed to remain within a known safe region of policy space. While deriving our class of constrained natural actor-critic algorithms, which we call Projected Natural Actor-Critics (PNACs), we also elucidate the relationship between natural gradient descent and mirror descent.
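
To make the idea of natural gradient descent over a constrained domain concrete, here is a minimal sketch on a toy problem. It is not the PNAC algorithm from the paper. It assumes a diagonal metric (a stand-in for the Fisher information matrix), a box-shaped safe region, and a projection taken in the norm induced by the metric. Under those assumptions the projection reduces to coordinate-wise clipping; a general metric would need a quadratic program instead. All function names and constants below are hypothetical and chosen only for illustration.

```python
# Toy sketch of projected natural gradient descent (illustrative only).
# Assumptions: diagonal metric, box-shaped safe region, quadratic objective.

def natural_gradient_step(theta, grad, metric_diag, step_size):
    """Take one step along -G^{-1} grad, where G = diag(metric_diag)."""
    return [t - step_size * g / m for t, g, m in zip(theta, grad, metric_diag)]


def project_box(theta, lower, upper):
    """Project onto the box [lower, upper].

    For a diagonal metric the squared metric norm is separable across
    coordinates, so the metric-norm projection onto a box is plain clipping.
    """
    return [min(max(t, lo), hi) for t, lo, hi in zip(theta, lower, upper)]


def projected_natural_gradient_descent(grad_fn, metric_fn, theta0, lower, upper,
                                       step_size=0.5, iterations=50):
    """Alternate a natural gradient step with a projection onto the safe box."""
    theta = project_box(theta0, lower, upper)
    history = [theta]
    for _ in range(iterations):
        theta = natural_gradient_step(theta, grad_fn(theta), metric_fn(theta), step_size)
        theta = project_box(theta, lower, upper)
        history.append(theta)
    return theta, history


# Hypothetical objective: f(theta) = 0.5 * sum_i a_i * (theta_i - c_i)^2,
# badly scaled on purpose so that the metric matters.
curvature = [100.0, 1.0]
target = [2.0, -0.5]   # the unconstrained minimizer lies outside the safe box


def grad_fn(theta):
    return [a * (t - c) for a, t, c in zip(curvature, theta, target)]


def metric_fn(theta):
    # Use the (constant) diagonal curvature as the metric for this toy case.
    return curvature


safe_lower = [-1.0, -1.0]
safe_upper = [1.0, 1.0]

theta_final, history = projected_natural_gradient_descent(
    grad_fn, metric_fn, [0.0, 0.0], safe_lower, safe_upper
)
print(theta_final)
```

In this toy setup every iterate stays inside the box, and the first coordinate settles on the boundary of the safe region because the unconstrained minimizer lies outside it. The metric rescales the gradient so that both coordinates move at the same rate despite the hundredfold difference in curvature.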
