Unpacking Reward Shaping

Neural Information Processing Systems

Much of this work is based on upper confidence bound (UCB) principles and prescribes an exploration bonus that prioritizes rarely visited regions.
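To make the idea of an exploration bonus concrete, here is a minimal sketch of a count-based bonus in the UCB spirit. It is an illustration only: the class name `CountBonus`, the `scale` parameter, and the `scale / sqrt(n)` form are assumptions chosen for clarity, not the method of any particular paper.

```python
import math
from collections import defaultdict


class CountBonus:
    """Hypothetical count-based exploration bonus (illustrative only).

    The bonus shrinks as a (state, action) pair is visited more often,
    so rarely visited pairs look more attractive to the agent.
    """

    def __init__(self, scale=1.0):
        self.scale = scale              # assumed tuning constant
        self.counts = defaultdict(int)  # visit counts per (state, action)

    def visit(self, state, action):
        """Record one visit and return the bonus for that visit."""
        key = (state, action)
        self.counts[key] += 1
        # Assumed bonus form: scale / sqrt(n), with n the visit count so far.
        return self.scale / math.sqrt(self.counts[key])


def reward_with_bonus(env_reward, bonus):
    """Add the exploration bonus to the environment reward."""
    return env_reward + bonus
```

With this sketch, the first visit to a pair yields a bonus of `scale`, and the bonus decays as the same pair is revisited, which steers the agent toward pairs it has seen less often.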