An Alternative to Variance: Gini Deviation for Risk-averse Policy Gradient

Yudong Luo

Neural Information Processing Systems 

Restricting the variance of a policy's return is a popular choice in risk-averse Reinforcement Learning (RL) because variance has a clear mathematical definition and is easy to interpret. Traditional methods restrict the variance of the total return directly, while more recent methods restrict the per-step reward variance as a proxy.
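To make the two variance-based quantities concrete, the following minimal sketch estimates them from a batch of sampled episodes. It is illustrative only and not the paper's implementation: the function names, the fixed-length `(episodes, steps)` reward array, the lack of discounting, and the equal weighting of all steps are assumptions. The `gini_deviation` helper assumes the common definition of Gini deviation as half the expected absolute difference between two independent copies of a random variable, estimated here over all sample pairs.

```python
import numpy as np


def return_variance(rewards):
    """Variance of the total (undiscounted) return across episodes.

    rewards: array of shape (episodes, steps); this layout is an assumption.
    """
    returns = np.asarray(rewards, dtype=float).sum(axis=1)
    return returns.var()


def per_step_reward_variance(rewards):
    """Variance of individual rewards pooled over all steps and episodes.

    Assumes every step is weighted equally (no discounting).
    """
    return np.asarray(rewards, dtype=float).var()


def gini_deviation(samples):
    """Half the mean absolute difference over all ordered sample pairs.

    Assumed definition: 0.5 * E|X1 - X2| for independent copies X1, X2.
    Self-pairs are included, so this is a simple (slightly biased) estimate.
    """
    x = np.asarray(samples, dtype=float)
    pairwise = np.abs(x[:, None] - x[None, :])
    return 0.5 * pairwise.mean()


# Two hypothetical episodes with the same total return but different rewards.
rewards = np.array([[1.0, 1.0],
                    [0.0, 2.0]])

print(return_variance(rewards))            # variance of the returns [2, 2]
print(per_step_reward_variance(rewards))   # variance of the rewards [1, 1, 0, 2]
print(gini_deviation(rewards.sum(axis=1))) # Gini deviation of the returns
```

In this toy batch both episodes have the same total return, so the return variance is zero while the per-step reward variance is not, which shows that the two quantities measure different things.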
