Reinforcement Learning
Hindsight Credit Assignment
A reinforcement learning (RL) agent is tasked with two fundamental, interdependent problems: exploration (how to discover useful data), and credit assignment (how to incorporate it). The simplest way of estimating the value function is by averaging returns (future discounted sums of rewards) starting from taking action a in state x.
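The return-averaging estimate mentioned above can be illustrated with a minimal sketch. It assumes episodes are stored as lists of (x, a, reward) tuples and uses an every-visit average. The function names `discounted_returns` and `monte_carlo_q`, the toy episodes, and the discount value are hypothetical choices for illustration, not taken from the paper.

```python
from collections import defaultdict


def discounted_returns(rewards, gamma):
    """Return G_t = r_t + gamma * r_{t+1} + ... for every step t of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each return reuses the one computed for the next step.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def monte_carlo_q(episodes, gamma=0.99):
    """Every-visit Monte Carlo estimate: mean return observed after taking a in x."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for episode in episodes:
        rewards = [r for (_, _, r) in episode]
        for (x, a, _), g in zip(episode, discounted_returns(rewards, gamma)):
            totals[(x, a)] += g
            counts[(x, a)] += 1
    return {key: totals[key] / counts[key] for key in totals}


# Toy data (assumed): two episodes that visit the same state-action pairs.
episodes = [
    [("s0", "left", 0.0), ("s1", "right", 1.0)],
    [("s0", "left", 0.0), ("s1", "right", 3.0)],
]
q = monte_carlo_q(episodes, gamma=0.5)
```

Here `q[("s1", "right")]` is the mean of the two observed rewards, and `q[("s0", "left")]` is the mean of the two discounted returns that follow the first step. Every sample credits the full return to the action taken, regardless of how much that action actually influenced the outcome.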
Success stories of deep reinforcement learning (RL) from high-dimensional inputs such as pixels or large spatial layouts include achieving superhuman performance on Atari games [30, 37, 1], grandmaster level in StarCraft II [50], and grasping a diverse set of objects with impressive success rates and generalization with robots in the real world [21].