 test environment





Object-Category Aware Reinforcement Learning

Neural Information Processing Systems

Reinforcement Learning (RL) has achieved impressive progress in recent years, such as results in Atari [24] and Go [28], in which RL agents even perform better than human beings.







A Properties of coherent distortion risk measures

Neural Information Processing Systems

The properties of coherent risk measures also lead to a useful dual representation. Let ρ be a proper, real-valued coherent risk measure. See Shapiro et al. [42] for a general treatment of this result. Therefore, we have that the RAMU safe RL problem in (3) is equivalent to (6). B.3 Proof of Corollary 1: Fix ϵ > 0 and consider (s, a) ∈ S × A. Safety constraints and environment perturbations: In all of our experiments, we consider the problem of optimizing a task objective while satisfying a safety constraint.