Reward learning from human preferences and demonstrations in Atari

Borja Ibarz, Jan Leike, Tobias Pohlen, Geoffrey Irving, Shane Legg, Dario Amodei

Nov-20-2025, 18:11:59 GMT–Neural Information Processing Systems

To solve complex real-world problems with reinforcement learning, we cannot rely on manually specified reward functions.

demonstration, machine learning, reinforcement learning

Country:
- North America > Canada > Quebec > Montreal (0.04)

Genre:
- Research Report > New Finding (0.46)

Industry:
- Leisure & Entertainment > Games (1.00)
- Education (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning
    - Reinforcement Learning (1.00)
    - Neural Networks > Deep Learning (0.69)
