Reviews: Log-normality and Skewness of Estimated State/Action Values in Reinforcement Learning
–Neural Information Processing Systems
This paper addresses a problem arising from skewness in the distribution of value estimates, which can result in over- or under-estimation. Through careful analysis, the paper shows that a particular model-based value estimate is approximately log-normally distributed; because this distribution is skewed, over- or under-estimation becomes possible. It further shows that positive and negative rewards induce opposite kinds of skewness. Simple experiments illustrate the over/underestimation problem. This is an interesting paper with some useful insights into over/underestimation of values.
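To make the skewness point concrete, here is a minimal sketch. It is not the paper's model: it only assumes that a set of estimates is log-normally distributed (the parameters 0.0 and 0.5 and the names `sample_skewness`, `positive_estimates`, and `negative_estimates` are illustrative choices) and checks two generic facts. A log-normal sample is positively skewed, so most draws fall below the sample mean, and negating the values flips the sign of the skewness.

```python
import random

def sample_skewness(values):
    """Third standardized moment of a sample (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Assumption: estimates are log-normal with illustrative parameters.
rng = random.Random(0)
positive_estimates = [rng.lognormvariate(0.0, 0.5) for _ in range(50_000)]

# Assumption: a negative-valued case modelled simply by negating the draws.
negative_estimates = [-v for v in positive_estimates]

skew_pos = sample_skewness(positive_estimates)
skew_neg = sample_skewness(negative_estimates)

# Share of draws that fall below the sample mean of the positive case.
mean_pos = sum(positive_estimates) / len(positive_estimates)
frac_below_mean = sum(v < mean_pos for v in positive_estimates) / len(positive_estimates)

print(f"skewness (positive values): {skew_pos:.3f}")
print(f"skewness (negated values):  {skew_neg:.3f}")
print(f"fraction of draws below the mean: {frac_below_mean:.3f}")
```

In this toy setting a typical single draw differs systematically from the mean, and the direction of that difference reverses when the sign of the values reverses, which is consistent with the opposite skewness the review describes for positive and negative rewards.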
Oct-7-2024, 23:22:47 GMT