The Value of Reward Lookahead in Reinforcement Learning

Neural Information Processing Systems

In reinforcement learning (RL), agents sequentially interact with changing environments while aiming to maximize the obtained rewards.