Reward learning from human preferences and demonstrations in Atari

Borja Ibarz, Jan Leike, Tobias Pohlen, Geoffrey Irving, Shane Legg, Dario Amodei

Neural Information Processing Systems 

To solve complex real-world problems with reinforcement learning, we cannot rely on manually specified reward functions.
