Collaborating Authors

Mimetic vs Anchored Value Alignment in Artificial Intelligence Artificial Intelligence

"Value alignment" (VA) is considered as one of the top priorities in AI research. Much of the existing research focuses on the "A" part and not the "V" part of "value alignment." This paper corrects that neglect by emphasizing the "value" side of VA and analyzes VA from the vantage point of requirements in value theory, in particular, of avoiding the "naturalistic fallacy"--a major epistemic caveat. The paper begins by isolating two distinct forms of VA: "mimetic" and "anchored." Then it discusses which VA approach better avoids the naturalistic fallacy. The discussion reveals stumbling blocks for VA approaches that neglect implications of the naturalistic fallacy. Such problems are more serious in mimetic VA since the mimetic process imitates human behavior that may or may not rise to the level of correct ethical behavior. Anchored VA, including hybrid VA, in contrast, holds more promise for future VA since it anchors alignment by normative concepts of intrinsic value.

The Mimicry Game: Towards Self-recognition in Chatbots Artificial Intelligence

In the standard Turing test, a machine has to prove its humanness to the judges. By successfully imitating a thinking entity such as a human, the machine then proves that it can also think. However, many objections have been raised against the validity of this argument. Such objections claim that the Turing test is not a tool for demonstrating the existence of general intelligence or thinking activity. In this light, alternatives to the Turing test should be investigated. Self-recognition tests applied to animals through mirrors appear to be a viable alternative for demonstrating the existence of a type of general intelligence. The methodology here constructs a textual version of the mirror test by placing the chatbot (in this context) as the one and only judge, who must figure out in an unsupervised manner whether the contacted party is an other, a mimicker, or oneself. This textual version of the mirror test is objective, self-contained, and mostly immune to the objections raised against the Turing test. Any chatbot passing this textual mirror test should have or acquire a thought mechanism that can be referred to as the inner voice, answering Turing's original and long-standing question "Can machines think?" in a constructive manner.
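The three-way judgment described in this abstract (other, mimicker, or oneself) can be illustrated with a toy sketch. This is not the paper's method, which the abstract does not spell out. The reply policy `judge_reply`, the function `classify_counterpart`, and the rule that a "mimicker" simply echoes the judge's messages are all assumptions made for illustration.

```python
from typing import Callable, List

# An agent is modeled as a function from an incoming message to a reply.
Agent = Callable[[str], str]


def judge_reply(message: str) -> str:
    """Hypothetical deterministic reply policy of the judging chatbot."""
    return f"I heard: {message[::-1]}"


def classify_counterpart(counterpart: Agent, probes: List[str]) -> str:
    """Label the counterpart as 'self', 'mimicker', or 'other'.

    Assumed rules (illustrative only):
    - 'self': every reply equals what the judge itself would have said.
    - 'mimicker': every reply is a verbatim echo of the judge's message.
    - 'other': anything else.
    """
    replies = [counterpart(p) for p in probes]
    if all(r == judge_reply(p) for p, r in zip(probes, replies)):
        return "self"
    if all(r == p for p, r in zip(probes, replies)):
        return "mimicker"
    return "other"


probes = ["hello", "what is your name?"]
print(classify_counterpart(judge_reply, probes))       # self
print(classify_counterpart(lambda m: m, probes))       # mimicker
print(classify_counterpart(lambda m: "beep", probes))  # other
```

A real chatbot is not deterministic and a real mimicker need not echo verbatim, so an actual test would presumably need a softer notion of similarity than exact string equality.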

Chinese room - Wikipedia


The Chinese room argument holds that a program cannot give a computer a "mind", "understanding" or "consciousness",[a] regardless of how intelligently or human-like the program may make the computer behave. The argument was first presented by philosopher John Searle in his paper, "Minds, Brains, and Programs", published in Behavioral and Brain Sciences in 1980. It has been widely discussed in the years since.[1] The centerpiece of the argument is a thought experiment known as the Chinese room.[2] The argument is directed against the philosophical positions of functionalism and computationalism,[3] which hold that the mind may be viewed as an information-processing system operating on formal symbols. The appropriately programmed computer with the right inputs and outputs would thereby have a mind in exactly the same sense human beings have minds.[b] Although it was originally presented in reaction to the statements of artificial intelligence (AI) researchers, it is not an argument against the goals of AI research, because it does not limit the amount of intelligence a machine can display.[4] The argument applies only to digital computers running programs and does not apply to machines in general.[5] Searle's thought experiment begins with this hypothetical premise: suppose that artificial intelligence research has succeeded in constructing a computer that behaves as if it understands Chinese. It takes Chinese characters as input and, by following the instructions of a computer program, produces other Chinese characters, which it presents as output. Suppose, says Searle, that this computer performs its task so convincingly that it comfortably passes the Turing test: it convinces a human Chinese speaker that the program is itself a live Chinese speaker. To all of the questions that the person asks, it makes appropriate responses, such that any Chinese speaker would be convinced that they are talking to another Chinese-speaking human being.
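The premise above, a program that turns Chinese input into Chinese output purely by following instructions, can be made concrete with a deliberately tiny sketch. The rulebook entries, `DEFAULT_REPLY`, and the function name `chinese_room` are hypothetical; Searle's thought experiment imagines a rulebook rich enough to pass the Turing test, which this toy table obviously is not.

```python
# Hypothetical rulebook: input symbol strings mapped to output symbol strings.
# Nothing in the program represents what any of the symbols mean.
RULEBOOK = {
    "你好吗": "我很好",
    "你会说中文吗": "会",
}

# Assumed fallback used when no rule matches the input.
DEFAULT_REPLY = "请再说一遍"


def chinese_room(symbols: str) -> str:
    """Return the output symbols that the rulebook prescribes for the input."""
    return RULEBOOK.get(symbols, DEFAULT_REPLY)


print(chinese_room("你好吗"))
print(chinese_room("今天天气怎么样"))
```

The lookup is purely syntactic: the function matches the shape of the input string and returns another string, which is the kind of formal symbol manipulation the argument targets.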

Human Indignity: From Legal AI Personhood to Selfish Memes Artificial Intelligence

Debates about rights are frequently framed around the concept of legal personhood, which is granted not just to human beings but also to some nonhuman entities, such as firms, corporations or governments. Legal entities, also known as legal persons, are granted certain privileges and responsibilities by the jurisdictions in which they are recognized, and many such rights are not available to nonperson agents. Attempting to secure legal personhood is often seen as a potential pathway to obtaining certain rights and protections for animals [1], fetuses [2], trees, rivers [3] and artificially intelligent (AI) agents [4]. It is commonly believed that a court ruling or a legislative action is necessary to grant personhood to a new type of entity, but recent legal literature [5-8] suggests that loopholes in current law may permit granting legal personhood to currently existing AI/software without having to change the law or persuade any court.

Personal Universes: A Solution to the Multi-Agent Value Alignment Problem Artificial Intelligence

Since the birth of the field of Artificial Intelligence (AI), researchers have worked on creating ever more capable machines, but with recent success in multiple subdomains of AI [1-7], the safety and security of such systems and of predicted future superintelligences [8, 9] have become paramount [10, 11]. While many diverse safety mechanisms are being investigated [12, 13], the ultimate goal is to align AI with the goals, values and preferences of its users, which is likely to include all of humanity. The value alignment problem [14] can be decomposed into three sub-problems: extraction of personal values from individual persons, combination of such personal preferences in a way that is acceptable to all, and finally production of an intelligent system that implements the combined values of humanity. A number of approaches for extracting values [15-17] from people have been investigated, including inverse reinforcement learning [18, 19], brain scanning [20], value learning from literature [21], and understanding of human cognitive limitations [22]. Assessing the potential for success of particular value extraction techniques is beyond the scope of this paper, and we simply assume that one of the current methods, their combination, or some future approach will allow us to accurately learn the values of given people.