Risk-Averse Reinforcement Learning: An Optimal Transport Perspective on Temporal Difference Learning
