Reviews: Why so gloomy? A Bayesian explanation of human pessimism bias in the multi-armed bandit task

Neural Information Processing Systems 

This paper presents an intriguing computational dissection of a particular form of reward rate underestimation in a bandit task (what the authors call "pessimism bias"). The modeling suggests that this bias can be accounted for by a Bayesian model that assumes (erroneously) that reward rates are dynamic. The paper is well written and the methods are sound. I think it could do a better job of relating to previous literature, and there are some questions about the modeling and behavioral analysis, which I detail below.

Specific comments: I was surprised that there was no mention of Gershman & Niv (2015, Topics in Cognitive Science), which is one of the few papers I'm aware of that manipulates reward abundance.