Low impact agency: review and discussion
arXiv.org Artificial Intelligence
The problem of artificial intelligence safety can be seen as ensuring that an agent with the power to cause harm chooses not to do so. In the limit, the agent may be powerful enough that causing an existential catastrophe is within its power, and it may have incentives to do so [6], so our task is to guarantee that it chooses not to. A possible approach is to penalize changes in the world caused by the agent, so that the agent avoids catastrophe because catastrophe entails large changes in the world [24]. The hope is that this is a relatively easy objective to align the agent with, compared with aligning it to the full range of human values. Our desideratum, then, is that the AI achieve something while doing as little in the world as possible.
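The penalty described above can be sketched as a reward shaped by a deviation term. The following is a minimal illustrative sketch, not any specific impact measure from the literature; all names (`impact_penalized_reward`, `count_diffs`, the states) are hypothetical, and the deviation function stands in for whatever measure of world-change one chooses.

```python
def impact_penalized_reward(task_reward, state, baseline_state,
                            deviation, penalty_weight=1.0):
    """Combine a task reward with a penalty on deviation from a baseline.

    deviation(state, baseline_state) is any non-negative measure of how
    different the world is from what it would have been had the agent
    done nothing (the "do nothing" baseline). Names here are illustrative.
    """
    return task_reward - penalty_weight * deviation(state, baseline_state)

# One crude deviation measure: count how many state variables changed.
def count_diffs(state, baseline_state):
    return sum(1 for s, b in zip(state, baseline_state) if s != b)

# Example: the agent earns 10 for the task but changed 3 state variables
# relative to the baseline, with penalty weight 2: 10 - 2*3 = 4.
r = impact_penalized_reward(10.0, (1, 1, 1, 1), (0, 0, 0, 1),
                            count_diffs, penalty_weight=2.0)
```

With a large enough `penalty_weight`, actions causing sweeping changes (such as a catastrophe) become net-negative even when their task reward is high, which is the intuition the abstract appeals to.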
Mar-6-2023