Joint Learning of Policy with Unknown Temporal Constraints for Safe Reinforcement Learning

Lunet Yifru, Ali Baheri

arXiv.org Artificial Intelligence 

Reinforcement learning (RL) has emerged as a powerful computational approach for training agents to achieve complex objectives through interactions within stochastic environments (Sutton and Barto 2018). RL algorithms have demonstrated significant success in a wide range of applications and domains (Singh, Kumar, and Singh 2022; Razzaghi et al. 2022). However, when deploying RL policies in real-world scenarios, particularly those involving safety-critical operations, ensuring the safety of the learning process becomes a paramount concern.

Another direction in safe RL is risk-sensitive RL, which aims to balance the trade-off between exploration, exploitation, and risk management (Mihatsch and Neuneier 2002). Risk-sensitive RL algorithms incorporate risk measures, such as conditional value-at-risk (CVaR) (Tamar, Glassner, and Mannor 2014) and the risk envelope (Majumdar et al. 2017), to guide the learning process. An additional approach to ensure safety in RL is shielding, which intervenes in the agent's actions when they might violate a safety constraint.
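To make the two mechanisms named above concrete, here is a minimal Python sketch, not taken from the paper: it shows a CVaR estimate over sampled returns and a simple shield that replaces a proposed action with a fallback when it would violate a safety predicate. The function names, the toy 1-D constraint, and the thresholds are all illustrative assumptions.

```python
# Illustrative sketch of two safety mechanisms mentioned above:
# (1) CVaR over sampled returns, (2) a shield that overrides unsafe actions.
import numpy as np

def cvar(returns, alpha=0.1):
    """Conditional value-at-risk: mean of the worst alpha-fraction of returns."""
    sorted_returns = np.sort(np.asarray(returns))
    k = max(1, int(np.ceil(alpha * len(sorted_returns))))
    return sorted_returns[:k].mean()

def shield(state, proposed_action, is_safe, fallback_action):
    """Return the proposed action unless it violates the safety predicate."""
    return proposed_action if is_safe(state, proposed_action) else fallback_action

# Toy usage: returns from rollouts, and a 1-D "stay inside [-1, 1]" constraint.
rollout_returns = np.random.default_rng(0).normal(loc=1.0, scale=0.5, size=1000)
print("CVaR_0.1 of returns:", cvar(rollout_returns, alpha=0.1))

is_safe = lambda s, a: abs(s + a) <= 1.0
print("shielded action:", shield(0.9, 0.5, is_safe, fallback_action=0.0))
```

In a risk-sensitive learner, an estimate like the CVaR above would feed into the objective or the policy update; in a shielded learner, the shield sits between the policy and the environment at every step.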
