Robustness and risk-sensitivity in Markov decision processes
Neural Information Processing Systems
We uncover relations between robust MDPs and risk-sensitive MDPs. The objective of a robust MDP is to minimize a function, such as the expected cumulative cost, in the worst case over uncertain parameters. The objective of a risk-sensitive MDP is to minimize a risk measure of the cumulative cost when the parameters are known. We show that a risk-sensitive MDP that minimizes the expected exponential utility is equivalent to a robust MDP that minimizes the worst-case expectation with a penalty, measured by the Kullback-Leibler divergence, for the deviation of the uncertain parameters from their nominal values. We also show that a risk-sensitive MDP that minimizes an iterated risk measure composed of certain coherent risk measures is equivalent to a robust MDP that minimizes the worst-case expectation when the possible deviations of the uncertain parameters from their nominal values are characterized by a concave function.
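The first equivalence can be illustrated numerically in a single-stage setting. The sketch below is not the paper's multi-stage construction. It assumes a finite cost distribution and relies on the standard variational identity (1/gamma) log E_p[exp(gamma X)] = max_q { E_q[X] - (1/gamma) KL(q || p) } for gamma > 0. The function names, the cost values, the nominal probabilities, and the value of gamma are all hypothetical choices made for illustration.

```python
import math

def entropic_risk(costs, probs, gamma):
    """(1/gamma) * log E_p[exp(gamma * X)] for a finite distribution, gamma > 0."""
    m = max(costs)  # shift by the largest cost for numerical stability
    s = sum(p * math.exp(gamma * (c - m)) for c, p in zip(costs, probs))
    return m + math.log(s) / gamma

def tilted_distribution(costs, probs, gamma):
    """Exponentially tilted distribution q_i proportional to p_i * exp(gamma * c_i)."""
    m = max(costs)
    weights = [p * math.exp(gamma * (c - m)) for c, p in zip(costs, probs)]
    total = sum(weights)
    return [w / total for w in weights]

def kl_divergence(q, p):
    """KL(q || p) for finite distributions; terms with q_i = 0 contribute 0."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def penalized_expectation(costs, q, p, gamma):
    """E_q[X] - (1/gamma) * KL(q || p): expected cost under q, penalized for leaving p."""
    expected = sum(qi * c for qi, c in zip(q, costs))
    return expected - kl_divergence(q, p) / gamma

# Hypothetical single-stage example (assumed values, not from the paper).
costs = [1.0, 2.0, 5.0]
probs = [0.5, 0.3, 0.2]   # nominal distribution p
gamma = 0.7               # risk-sensitivity parameter

risk = entropic_risk(costs, probs, gamma)
q_star = tilted_distribution(costs, probs, gamma)
robust = penalized_expectation(costs, q_star, probs, gamma)
nominal = penalized_expectation(costs, probs, probs, gamma)  # q = p, zero penalty
```

Under these assumptions, `risk` and `robust` agree up to floating-point error, because the tilted distribution attains the maximum of the penalized expectation. Any other choice of q, such as the nominal distribution itself, yields a value no larger than `risk`.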
Mar-14-2024, 17:43:33 GMT
- Country:
- North America > United States
- New Jersey > Hudson County
- Hoboken (0.14)
- Massachusetts > Middlesex County
- Cambridge (0.04)
- Europe > Germany
- Berlin (0.04)
- Asia > Japan
- Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)