Reward Is Not Enough for Risk-Averse Reinforcement Learning


TL;DR: Risk aversion is essential in many RL applications (e.g., driving, robotic surgery, and finance). Some modified RL frameworks account for risk (e.g., by optimizing a risk-measure of the return instead of its expectation), but they pose new algorithmic challenges. It is therefore often suggested to stick with the good old RL framework and simply redefine the rewards so that negative outcomes are amplified. Unfortunately, as discussed below, modeling risk as an expectation over redefined rewards is often unnatural, impractical, or even mathematically impossible, and hence cannot replace explicit optimization of risk-measures. This is consistent with similar results from decision theory, where risk optimization is not equivalent to expected-utility maximization.
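
To make the distinction concrete, here is a minimal sketch (not from the article) contrasting the risk-neutral objective with a common risk-measure, CVaR, on the same sampled episode returns. The return distribution, the alpha level, and the penalty factor are all hypothetical illustrations.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical episode returns from some policy: mostly fine, occasionally disastrous.
returns = np.concatenate([rng.normal(10.0, 1.0, 950),   # typical episodes
                          rng.normal(-50.0, 5.0, 50)])  # rare catastrophic episodes

def cvar(samples, alpha=0.05):
    """Conditional Value-at-Risk: mean of the worst alpha-fraction of outcomes."""
    cutoff = np.quantile(samples, alpha)
    return samples[samples <= cutoff].mean()

print("Expected return:    ", returns.mean())  # looks acceptable (~7)
print("CVaR_0.05 of return:", cvar(returns))   # exposes the catastrophic tail (~-50)

# Amplifying negative outcomes (here applied to the whole return for simplicity,
# rather than per-step rewards) changes the expectation, but it is still an
# expectation; in general it does not coincide with any CVaR of the original return.
penalty = 3.0
reshaped = np.where(returns < 0, penalty * returns, returns)
print("Expectation of reshaped returns:", reshaped.mean())
```

The point of the sketch is that the risk-neutral mean hides the rare catastrophic tail, the CVaR objective surfaces it, and a fixed reward-reshaping factor only rescales the expectation rather than reproducing the risk-measure.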
