Learning from Human Preferences

Jun-14-2017, 15:50:21 GMT–#artificialintelligence

One step towards building safe AI systems is to remove the need for humans to write goal functions, since using a simple proxy for a complex goal, or getting the complex goal a bit wrong, can lead to undesirable and even dangerous behavior. In collaboration with DeepMind's safety team, we've developed an algorithm which can infer what humans want by being told which of two proposed behaviors is better. We present a learning algorithm that uses small amounts of human feedback to solve modern RL environments. Machine learning systems with human feedback have been explored before, but we've scaled up the approach to work on much more complicated tasks. Our algorithm needed 900 bits of feedback from a human evaluator to learn to backflip, a seemingly simple task which is easy to judge but challenging to specify.
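To make the idea of learning from "which of two behaviors is better" concrete, here is a minimal sketch, not the algorithm described above. It assumes each behavior is summarized by a small feature vector, that the reward is linear in those features, and that the chance of a human preferring one behavior follows a logistic function of the reward difference. All names (`fit_reward_from_preferences`, `reward`, the toy features) are hypothetical, and the "human" is simulated.

```python
import math
import random


def fit_reward_from_preferences(pairs, n_features, lr=0.5, epochs=50):
    """Fit linear reward weights so preferred behaviors score higher.

    pairs: list of (preferred_features, other_features) tuples; each element
    is a list of floats summarizing one behavior (an assumed representation).
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [b - c for b, c in zip(better, worse)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            # Probability the model assigns to the human's actual choice.
            p = 1.0 / (1.0 + math.exp(-margin))
            # Gradient ascent on the log-likelihood of that choice.
            for i in range(n_features):
                w[i] += lr * (1.0 - p) * diff[i]
    return w


def reward(w, features):
    """Score one behavior under the learned linear reward."""
    return sum(wi * fi for wi, fi in zip(w, features))


# Toy data: a simulated evaluator always prefers the behavior whose first
# feature is larger; the second feature is irrelevant noise.
rng = random.Random(0)
pairs = []
for _ in range(50):
    a = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    b = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    pairs.append((a, b) if a[0] > b[0] else (b, a))

weights = fit_reward_from_preferences(pairs, n_features=2)
```

In a full system the learned reward would then be handed to a reinforcement learning agent in place of a hand-written goal function; that step is omitted here.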


