Deterministic Policies for Constrained Reinforcement Learning in Polynomial-Time

May-23-2024–arXiv.org Artificial Intelligence

Constrained Reinforcement Learning (CRL) traditionally produces stochastic, expectation-constrained policies that can behave undesirably: imagine a self-driving car that randomly changes lanes or runs out of fuel. Artificial decision-making systems, however, must be predictable, trustworthy, and robust. One way to ensure these qualities is to focus on deterministic policies, which are inherently predictable and trustworthy. Moreover, they are easy to implement [10], reliable for autonomous vehicles [16, 12], and effective for multi-agent coordination [23]. Similarly, almost-sure and anytime constraints [21] provide inherent trustworthiness and robustness, which are essential for applications in medicine [6, 22, 18], disaster relief [9, 29, 27], and resource management [20, 19, 24, 4]. Despite the advantages of deterministic policies and stricter constraints, computing them remains an open challenge in CRL. Our research addresses this challenge by studying the computational complexity of computing deterministic policies for a wide range of constraint types. Consider a constrained Markov Decision Process (cMDP) denoted by M. Let C represent an arbitrary cost criterion and B the available budget.
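
To make the setup concrete, the sketch below enumerates every deterministic policy of a tiny finite-horizon cMDP and keeps the best one whose cost stays within a budget, under either an expectation constraint or an almost-sure (worst-case) constraint. Everything in it is an assumption made for illustration: the two-state model, the names `P`, `REWARD`, `COST`, `evaluate`, and `best_deterministic`, and the brute-force search itself, which is exponential in the number of (time, state) pairs and is not the polynomial-time method the paper studies.

```python
from itertools import product

# Hypothetical toy cMDP (not from the paper): 2 states, 2 actions, horizon 2.
# P[s][a] is a list of (probability, next_state) pairs.
P = {
    0: {0: [(1.0, 0)], 1: [(0.5, 0), (0.5, 1)]},
    1: {0: [(1.0, 1)], 1: [(1.0, 0)]},
}
REWARD = {0: {0: 0.0, 1: 1.0}, 1: {0: 0.0, 1: 2.0}}
COST = {0: {0: 0.0, 1: 1.0}, 1: {0: 0.0, 1: 1.0}}
H, START = 2, 0
STATES, ACTIONS = sorted(P), [0, 1]


def evaluate(policy):
    """Return (expected reward, expected cost, worst-case cost) from START."""
    value = {s: 0.0 for s in STATES}
    exp_cost = dict(value)
    worst_cost = dict(value)
    # Backward induction over the horizon for a fixed deterministic policy.
    for h in reversed(range(H)):
        new_value, new_exp, new_worst = {}, {}, {}
        for s in STATES:
            a = policy[(h, s)]
            nxt = P[s][a]
            new_value[s] = REWARD[s][a] + sum(p * value[t] for p, t in nxt)
            new_exp[s] = COST[s][a] + sum(p * exp_cost[t] for p, t in nxt)
            # Worst case over all successors reachable with positive probability.
            new_worst[s] = COST[s][a] + max(worst_cost[t] for p, t in nxt if p > 0)
        value, exp_cost, worst_cost = new_value, new_exp, new_worst
    return value[START], exp_cost[START], worst_cost[START]


def best_deterministic(budget, almost_sure=False):
    """Brute-force search over all deterministic (time, state) -> action maps."""
    keys = [(h, s) for h in range(H) for s in STATES]
    best = None
    for choice in product(ACTIONS, repeat=len(keys)):
        policy = dict(zip(keys, choice))
        value, exp_cost, worst_cost = evaluate(policy)
        cost = worst_cost if almost_sure else exp_cost
        if cost <= budget and (best is None or value > best[0]):
            best = (value, policy)
    return best  # None if no deterministic policy is feasible


if __name__ == "__main__":
    print(best_deterministic(1.5))
    print(best_deterministic(1.5, almost_sure=True))
```

In this toy model the almost-sure constraint is the stricter of the two: a policy may satisfy the budget on average while still exceeding it on some trajectory, so the best feasible value under the almost-sure constraint can be lower than under the expectation constraint.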
