Reward learning from human preferences and demonstrations in Atari
