Low impact agency: review and discussion

Naiff, Danilo, Goel, Shashwat

arXiv.org Artificial Intelligence 

The problem of artificial intelligence safety can be seen as ensuring that an agent with the power to cause harm chooses not to do so. In the limit, the agent can be powerful enough that causing an existential catastrophe is within its reach, and it may have incentives to do so [6], so our task is to guarantee that it chooses not to. A possible approach is to penalize changes in the world caused by the agent, leading the agent to avoid catastrophe because catastrophe entails large changes in the world [24]. The hope is that this is a relatively easy objective to align the agent with, as opposed to aligning it with the full range of human values. Our desideratum, then, is that the AI achieves something while doing as little in the world as possible.
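The penalized objective described above can be sketched minimally as a task reward minus a weighted impact term. This is an illustrative sketch only, not the paper's formulation: the function name, the `impact` measure, and the trade-off coefficient `beta` are all hypothetical placeholders.

```python
def penalized_reward(task_reward: float, impact: float, beta: float = 0.1) -> float:
    """Hypothetical low-impact objective: task reward minus an impact penalty.

    `impact` stands in for any nonnegative measure of how much the agent's
    actions changed the world (e.g. deviation from some baseline state);
    `beta` trades off task performance against low impact.
    """
    return task_reward - beta * impact

# An agent maximizing this objective prefers plans that achieve the task
# while changing the world as little as possible: for equal task reward,
# the lower-impact plan scores higher.
low_impact_plan = penalized_reward(task_reward=1.0, impact=2.0)
high_impact_plan = penalized_reward(task_reward=1.0, impact=50.0)
assert low_impact_plan > high_impact_plan
```

How `impact` is actually measured (and against which baseline) is the substantive design question that impact-regularization proposals differ on.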
