Reviews: Interval timing in deep reinforcement learning agents

Neural Information Processing Systems 

After reading the Author Feedback: The authors responded to all of my concerns in detail. This is an interesting, well-thought-out contribution, and I am happy to increase my score.

Summary: In this paper, the authors investigate how deep reinforcement learning agents with distinct architectures (mainly feed-forward vs. recurrent) learn to solve an interval timing task analogous to a time reproduction task widely used in the human timing literature, implemented in a virtual psychophysics lab (PsychLab/DeepMind Lab). Briefly, in each trial the agent has to measure the time interval between a "ready" and a "set" cue, and then wait for the same duration before responding by moving its virtual gaze inside a "go" target. The goal is for the duration between the "set" cue and the "go" response to match the interval between "ready" and "set". Time intervals during training are drawn from a discrete uniform distribution.
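To make the trial structure concrete, here is a minimal sketch of the task as I understand it from the description above. It is not the authors' implementation: the interval values, the step-based time unit, and the names `SAMPLE_INTERVALS`, `sample_trial`, and `reproduction_error` are my own assumptions for illustration.

```python
import random

# Hypothetical interval set (in environment steps). The paper is only described
# here as using a discrete uniform distribution; these values are assumed.
SAMPLE_INTERVALS = [10, 20, 30, 40, 50]


def sample_trial(rng):
    """Draw one sample interval uniformly and return the "ready" and "set" cue times."""
    t_ready = 0
    t_set = t_ready + rng.choice(SAMPLE_INTERVALS)
    return t_ready, t_set


def reproduction_error(t_ready, t_set, t_go):
    """Produced interval ("set" to "go") minus sample interval ("ready" to "set")."""
    sample = t_set - t_ready
    produced = t_go - t_set
    return produced - sample


rng = random.Random(0)
t_ready, t_set = sample_trial(rng)

# An ideally timed response waits exactly as long as the sample interval.
perfect_go = t_set + (t_set - t_ready)
print(reproduction_error(t_ready, t_set, perfect_go))  # prints 0
```

A positive error means the agent responded too late and a negative error means it responded too early, which is the quantity usually examined in time reproduction experiments.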