An overview of 11 proposals for building safe advanced AI

Dec-4-2020–arXiv.org Artificial Intelligence

This paper analyzes and compares 11 proposals for building safe advanced AI under the current machine learning paradigm, including major contenders such as iterated amplification, AI safety via debate, and recursive reward modeling. Each proposal is evaluated on four components: outer alignment, inner alignment, training competitiveness, and performance competitiveness. The distinction between the latter two is introduced in this paper. Prior literature has primarily analyzed individual proposals, or has focused on outer alignment at the expense of inner alignment; this analysis instead compares a wide range of proposals across all four components.

amplification, competitiveness, reward modeling, (16 more...)

Country:
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre:
- Research Report (0.90)

Industry:
- Leisure & Entertainment > Games (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (0.94)
  - Representation & Reasoning > Agents (0.46)
  - Machine Learning
    - Reinforcement Learning (0.68)
    - Neural Networks (0.46)
