Entropic Risk Constrained Soft-Robust Policy Optimization

Russel, Reazul Hasan, Behzadian, Bahram, Petrik, Marek

Jun-20-2020, arXiv.org Machine Learning

Obtaining a perfect model from which to compute the optimal policy is often infeasible in reinforcement learning. In high-stakes domains, it is important to quantify and manage the risk induced by model uncertainty. The entropic risk measure is an exponential utility-based convex risk measure that satisfies many reasonable properties. In this paper, we propose entropic risk constrained policy gradient and actor-critic algorithms that are risk-averse to model uncertainty. We demonstrate the usefulness of our algorithms on several problem domains.
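
To make the entropic risk measure mentioned in the abstract concrete, here is a minimal sketch that is not taken from the paper. It assumes the common reward-based form rho_alpha(X) = -(1/alpha) * log E[exp(-alpha * X)] with risk-aversion level alpha > 0; the paper's own sign convention and notation may differ. The function name `entropic_risk` and the sample returns are hypothetical, with each sample standing in for the return of a policy under one plausible model.

```python
import math


def entropic_risk(returns, alpha):
    """Entropic risk of sampled returns (assumed form, not the paper's code).

    Computes -(1/alpha) * log(mean(exp(-alpha * x))) for alpha > 0.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if not returns:
        raise ValueError("returns must be non-empty")
    # Use the log-sum-exp trick so large |alpha * x| does not overflow.
    scaled = [-alpha * x for x in returns]
    shift = max(scaled)
    mean_exp = sum(math.exp(s - shift) for s in scaled) / len(scaled)
    log_mean_exp = shift + math.log(mean_exp)
    return -log_mean_exp / alpha


# Hypothetical returns of one policy under four sampled models.
sampled_returns = [10.0, 12.0, 8.0, -5.0]
mean_return = sum(sampled_returns) / len(sampled_returns)
risk_adjusted = entropic_risk(sampled_returns, alpha=0.5)

print(f"mean return:   {mean_return:.3f}")
print(f"entropic risk: {risk_adjusted:.3f}")
```

Under this form, the risk-adjusted value never exceeds the mean return (by Jensen's inequality) and moves toward the worst sampled return as alpha grows, which is the sense in which a policy evaluated this way is risk-averse to model uncertainty.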

artificial intelligence, machine learning, reinforcement learning

Country:
- North America > United States > New Hampshire (0.04)

Genre:
- Research Report (0.40)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Optimization (0.48)
  - Machine Learning > Reinforcement Learning (0.35)
