Confident Natural Policy Gradient for Local Planning in q

May-25-2025, 08:41:22 GMT–Neural Information Processing Systems

The constrained Markov decision process (CMDP) framework has emerged as an important reinforcement learning approach for imposing safety or other critical objectives while maximizing cumulative reward. However, how to learn efficiently in a CMDP with a potentially infinite number of states is still not well understood, particularly when function approximation is applied to the value functions.

artificial intelligence, machine learning, reinforcement learning

Country:
- Europe > United Kingdom
  - England (0.14)
- North America
  - Canada > Alberta (0.14)
  - United States > California (0.14)

Genre:
- Research Report > Experimental Study (0.92)

Industry:
- Information Technology (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning
    - Neural Networks > Deep Learning (0.45)
    - Reinforcement Learning (0.86)
  - Representation & Reasoning (1.00)