Relentless progress in artificial intelligence (AI) is increasingly raising concerns that machines will replace humans on the job market, and perhaps altogether. Eliezer Yudkowski and others have explored the possibility that a promising future for humankind could be guaranteed by a superintelligent "Friendly AI", designed to safeguard humanity and its values. I will argue that, from a physics perspective where everything is simply an arrangement of elementary particles, this might be even harder than it appears. Indeed, it may require thinking rigorously about the meaning of life: What is "meaning" in a particle arrangement?
Omohundro has argued that sufficiently advanced AI systems of any design would, by default, have incentives to pursue a number of instrumentally useful subgoals, such as acquiring more computing power and amassing many resources. Omohundro refers to these as “basic AI drives,” and he, along with Bostrom and others, has argued that this means great care must be taken when designing powerful autonomous systems, because even if they have harmless goals, the side effects of pursuing those goals may be quite harmful. These arguments, while intuitively compelling, are primarily philosophical. In this paper, we provide formal models that demonstrate Omohundro’s thesis, thereby putting mathematical weight behind those intuitive claims.
This paper considers ethical, philosophical, and technical topics related to achieving beneficial human-level AI and superintelligence. Human-level AI need not be human-identical: The concept of self-preservation could be quite different for a human-level AI, and an AI system could be willing to sacrifice itself to save human life. Artificial consciousness need not be equivalent to human consciousness, and there need not be an ethical problem in switching off a purely symbolic artificial consciousness. The possibility of achieving superintelligence is discussed, including potential for ‘conceptual gulfs’ with humans, which may be bridged. Completeness conjectures are given for the ‘TalaMind’ approach to emulate human intelligence, and for the ability of human intelligence to understand the universe. The possibility and nature of strong vs. weak superintelligence are discussed. Two paths to superintelligence are described: The first path could be catastrophically harmful to humanity and life in general, perhaps leading to extinction events. The second path should improve our ability to achieve beneficial superintelligence. Human-level AI and superintelligence may be necessary for the survival and prosperity of humanity.
Now I suppose it is possible that once an agent reaches a sufficient level of intellectual ability it derives some universal morality from the ether and there really is nothing to worry about, but I hope you agree that this is, at the very least, not a conservative assumption. For the purposes of this article I will take the orthogonality thesis as a given. So a smarter than human artificial intelligence can have any goal.
Parts of this essay by Andrew Smart are adapted from his book Beyond Zero And One (2015), published by OR Books. Machine intelligence is growing at an increasingly rapid pace. The leading minds on the cutting edge of AI research think that machines with human-level intelligence will likely be realized by the year 2100. Beyond this, artificial intelligences that far outstrip human intelligence would rapidly be created by the human-level AIs. This vastly superhuman AI will result from an "intelligence explosion."