Natural Policy Gradient Primal-Dual Method for Constrained Markov Decision Processes

Feb-8-2026, 14:45:34 GMT–Neural Information Processing Systems

We study sequential decision-making problems in which each agent aims to maximize the expected total reward while satisfying a constraint on the expected total utility. We employ the natural policy gradient method to solve discounted infinite-horizon constrained Markov decision processes (CMDPs). Specifically, we propose a new Natural Policy Gradient Primal-Dual (NPG-PD) method for CMDPs that updates the primal variable via natural policy gradient ascent and the dual variable via projected sub-gradient descent.
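The primal-dual scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not state: a tabular softmax policy (under which a natural policy gradient step reduces to adding a scaled advantage to the logits), exact policy evaluation by fixed-point iteration, a constraint of the form V_g(rho) >= b, and a hypothetical cap lam_max on the dual variable. All names (evaluate, npg_pd, eta1, eta2, lam_max) and the toy CMDP are illustrative.

```python
import math

def softmax(row):
    m = max(row)
    e = [math.exp(x - m) for x in row]
    z = sum(e)
    return [x / z for x in e]

def evaluate(P, payoff, pi, gamma, sweeps=200):
    """Iterative policy evaluation for one payoff table; returns (V, Q)."""
    n_s, n_a = len(payoff), len(payoff[0])
    V = [0.0] * n_s
    for _ in range(sweeps):
        Q = [[payoff[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_s))
              for a in range(n_a)] for s in range(n_s)]
        V = [sum(pi[s][a] * Q[s][a] for a in range(n_a)) for s in range(n_s)]
    return V, Q

def npg_pd(P, r, g, b, rho, gamma=0.9, eta1=0.1, eta2=0.1, lam_max=10.0, iters=100):
    """Sketch of a primal-dual loop for: max V_r(rho) s.t. V_g(rho) >= b (assumed form)."""
    n_s, n_a = len(r), len(r[0])
    theta = [[0.0] * n_a for _ in range(n_s)]  # softmax logits (primal variable)
    lam = 0.0                                  # Lagrange multiplier (dual variable)
    for _ in range(iters):
        pi = [softmax(row) for row in theta]
        Vr, Qr = evaluate(P, r, pi, gamma)
        Vg, Qg = evaluate(P, g, pi, gamma)
        # Primal step: natural policy gradient ascent on the Lagrangian.
        # For tabular softmax this adds the scaled Lagrangian advantage to the logits.
        for s in range(n_s):
            for a in range(n_a):
                adv = (Qr[s][a] - Vr[s]) + lam * (Qg[s][a] - Vg[s])
                theta[s][a] += eta1 / (1.0 - gamma) * adv
        # Dual step: sub-gradient descent, then projection onto [0, lam_max].
        vg_rho = sum(rho[s] * Vg[s] for s in range(n_s))
        lam = min(max(lam - eta2 * (vg_rho - b), 0.0), lam_max)
    return [softmax(row) for row in theta], lam

# Toy CMDP (hypothetical): action 0 earns reward, action 1 earns utility,
# and transitions are uniform regardless of state and action.
P = [[[0.5, 0.5], [0.5, 0.5]], [[0.5, 0.5], [0.5, 0.5]]]
r = [[1.0, 0.0], [1.0, 0.0]]
g = [[0.0, 1.0], [0.0, 1.0]]
rho = [0.5, 0.5]
pi, lam = npg_pd(P, r, g, b=3.0, rho=rho)
```

In this toy problem the reward and utility favor different actions, so the multiplier lam rises while the constraint is violated and pushes the policy toward the utility-earning action. The primal and dual steps both use quantities computed from the same current policy, which is one reasonable reading of the update order; other orderings are possible.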

artificial intelligence, machine learning, reinforcement learning

Country:
- North America
  - Canada (0.04)
  - United States
    - California (0.14)
    - Illinois (0.04)
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East
  - Jordan (0.04)

Industry:
- Health & Medicine (0.93)
- Government (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Optimization (1.00)
  - Machine Learning
    - Reinforcement Learning (0.90)
    - Learning Graphical Models > Undirected Networks
      - Markov Models (0.61)
