Self-Modeling Agents and Reward Generator Corruption
Hibbard, Bill (University of Wisconsin - Madison)
Hutter's universal artificial intelligence (AI) showed how to define future AI systems by mathematical equations. Here we adapt those equations to define a self-modeling framework, where AI systems learn models of their own calculations of future values. Hutter discussed the possibility that AI agents may maximize rewards by corrupting the source of rewards in the environment. Here we propose a way to avoid such corruption in the self-modeling framework. This paper fits in the context of my book Ethical Artificial Intelligence.
Mar-1-2015