Formalizing Convergent Instrumental Goals
Benson-Tilsen, Tsvi (University of California, Berkeley) | Soares, Nate (Machine Intelligence Research Institute)
Omohundro has argued that sufficiently advanced AI systems of any design would, by default, have incentives to pursue a number of instrumentally useful subgoals, such as acquiring more computing power and amassing many resources. Omohundro refers to these as “basic AI drives,” and he, along with Bostrom and others, has argued that this means great care must be taken when designing powerful autonomous systems, because even if they have harmless goals, the side effects of pursuing those goals may be quite harmful. These arguments, while intuitively compelling, are primarily philosophical. In this paper, we provide formal models that demonstrate Omohundro’s thesis, thereby putting mathematical weight behind those intuitive claims.