Goto

Collaborating Authors

 omohundro


An AGI with Time-Inconsistent Preferences

arXiv.org Artificial Intelligence

This paper reveals a trap for artificial general intelligence (AGI) theorists who use economists' standard method of discounting. This trap is implicitly and falsely assuming that a rational AGI would have timeconsistent preferences. An agent with time-inconsistent preferences knows that its future self will disagree with its current self concerning intertemporal decision making. Such an agent cannot automatically trust its future self to carry out plans that its current self considers optimal. Economists have long used utility functions to model how rational agents behave (see Mas-Colell et al., 1995).


The Off-Switch Game

arXiv.org Artificial Intelligence

It is clear that one of the primary tools we can use to mitigate the potential risk from a misbehaving AI system is the ability to turn the system off. As the capabilities of AI systems improve, it is important to ensure that such systems do not adopt subgoals that prevent a human from switching them off. This is a challenge because many formulations of rational agents create strong incentives for self-preservation. This is not caused by a built-in instinct, but because a rational agent will maximize expected utility and cannot achieve whatever objective it has been given if it is dead. Our goal is to study the incentives an agent has to allow itself to be switched off. We analyze a simple game between a human H and a robot R, where H can press R's off switch but R can disable the off switch. A traditional agent takes its reward function for granted: we show that such agents have an incentive to disable the off switch, except in the special case where H is perfectly rational. Our key insight is that for R to want to preserve its off switch, it needs to be uncertain about the utility associated with the outcome, and to treat H's actions as important observations about that utility. (R also has no incentive to switch itself off in this setting.) We conclude that giving machines an appropriate level of uncertainty about their objectives leads to safer designs, and we argue that this setting is a useful generalization of the classical AI paradigm of rational agents.


Formalizing Convergent Instrumental Goals

AAAI Conferences

Omohundro has argued that sufficiently advanced AI systems of any design would, by default, have incentives to pursue a number of instrumentally useful subgoals, such as acquiring more computing power and amassing many resources. Omohundro refers to these as โ€œbasic AI drives,โ€ and he, along with Bostrom and others, has argued that this means great care must be taken when designing powerful autonomous systems, because even if they have harmless goals, the side effects of pursuing those goals may be quite harmful. These arguments, while intuitively compelling, are primarily philosophical. In this paper, we provide formal models that demonstrate Omohundroโ€™s thesis, thereby putting mathematical weight behind those intuitive claims.


Family Discovery

Neural Information Processing Systems

"Family discovery" is the task of learning the dimension and structure of a parameterized family of stochastic models. It is especially appropriate when the training examples are partitioned into "episodes" of samples drawn from a single parameter value. We present three family discovery algorithms based on surface learning and show that they significantly improve performance over two alternatives on a parameterized classification task.


Family Discovery

Neural Information Processing Systems

"Family discovery" is the task of learning the dimension and structure of a parameterized family of stochastic models. It is especially appropriate when the training examples are partitioned into "episodes" of samples drawn from a single parameter value. We present three family discovery algorithms based on surface learning and show that they significantly improve performance over two alternatives on a parameterized classification task.


Family Discovery

Neural Information Processing Systems

"Family discovery" is the task of learning the dimension and structure ofa parameterized family of stochastic models. It is especially appropriatewhen the training examples are partitioned into "episodes" of samples drawn from a single parameter value. We present three family discovery algorithms based on surface learning andshow that they significantly improve performance over two alternatives on a parameterized classification task. 1 INTRODUCTION Human listeners improve their ability to recognize speech by identifying the accent of the speaker. "Might" in an American accent is similar to "mate" in an Australian accent. By first identifying the accent, discrimination between these two words is improved.