Methods discounted by the switching costs c - see below) and the latest action a

Neural Information Processing Systems 

Noisy reward depletion was applied to the noisy values of previous rewards (Equation 2).