An Alternative to Variance: Gini Deviation for Risk-averse Policy Gradient

Yudong Luo

Neural Information Processing Systems

Restricting the variance of a policy's return is a popular choice in risk-averse Reinforcement Learning (RL) due to its clear mathematical definition and easy interpretability. Traditional methods directly restrict the total return variance. Recent methods restrict the per-step reward variance as a proxy.
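To make the distinction between the two quantities concrete, here is a minimal sketch that estimates both from sampled episodes. It is an illustration under stated assumptions, not the paper's method: the function names and the toy `episodes` data are hypothetical, and the per-step variance is computed by simply pooling all rewards, which simplifies the weighted definitions used in the literature.

```python
import numpy as np

def return_variance(episodes, gamma=1.0):
    """Variance of the (discounted) total return across episodes."""
    returns = [sum(gamma**t * r for t, r in enumerate(ep)) for ep in episodes]
    return float(np.var(returns))

def per_step_reward_variance(episodes):
    """Variance of individual rewards, pooled over all steps of all episodes.

    Assumption: uniform pooling; a discounted visitation weighting is ignored.
    """
    rewards = np.concatenate([np.asarray(ep, dtype=float) for ep in episodes])
    return float(np.var(rewards))

# Hypothetical toy data: three two-step episodes, each with total return 2.
episodes = [[1.0, 1.0], [0.0, 2.0], [2.0, 0.0]]

print(return_variance(episodes))           # total return never varies
print(per_step_reward_variance(episodes))  # individual rewards do vary
```

In this toy data every episode has the same total return, so the return variance is zero while the pooled per-step reward variance is positive. The two measures can therefore disagree, which is why the per-step quantity is only a proxy.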