Toward Morality and Ethics for Robots

AAAI Conferences

Humans need morality and ethics to get along constructively as members of the same society. As we face the prospect of robots taking a larger role in society, we need to consider how they, too, should behave toward other members of society. To the extent that robots will be able to act as agents in their own right, as opposed to being simply tools controlled by humans, they will need to behave according to some moral and ethical principles. Inspired by recent research on the cognitive science of human morality, we take steps toward an architecture for morality and ethics in robots. As in humans, there is a rapid intuitive response to the current situation. Reasoned reflection takes place at a slower time-scale and is focused more on constructing a justification than on revising the reaction. There is, however, a yet slower process of social interaction, in which examples of moral judgments and their justifications influence the moral development both of individuals and of the society as a whole. We illustrate this moral architecture with several examples, and we identify research results that will be necessary for the architecture to be implemented.
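The two-timescale structure the abstract describes (a fast intuitive response, with slower reflection that justifies rather than revises it) can be sketched as a minimal decision loop. Everything below is an illustrative assumption, not the paper's architecture: the lookup-table intuitions, and the function names `intuitive_response`, `reflect`, and `moral_judgment` are all hypothetical.

```python
# Minimal sketch of a dual-process moral judgment loop, assuming:
# - a fast, pattern-matching intuitive layer (here, a lookup table),
# - a slower reflective layer that constructs a justification for
#   the intuition rather than overriding it.

# Hypothetical intuitive associations: situation feature -> snap judgment.
INTUITIONS = {
    "harm_to_person": "wrong",
    "helping_stranger": "right",
}

def intuitive_response(situation_features):
    """Fast path: return the first matching snap judgment, if any."""
    for feature in situation_features:
        if feature in INTUITIONS:
            return INTUITIONS[feature]
    return "uncertain"

def reflect(judgment, situation_features):
    """Slow path: construct a post-hoc justification for the judgment."""
    return (f"Judged '{judgment}' because the situation involved "
            f"{', '.join(situation_features)}.")

def moral_judgment(situation_features):
    judgment = intuitive_response(situation_features)
    justification = reflect(judgment, situation_features)
    return judgment, justification
```

The still-slower social process the abstract mentions would, in this sketch, correspond to updating the `INTUITIONS` table over time as judgments and justifications circulate in a community.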

How Can We Trust a Robot?

Communications of the ACM

Advances in artificial intelligence (AI) and robotics have raised concerns about the impact on our society of intelligent robots, unconstrained by morality or ethics.7,9 Science fiction and fantasy writers over the ages have portrayed how decision-making by intelligent robots and other AIs could go wrong. In the movie Terminator 2, SkyNet is an AI that runs the nuclear arsenal "with a perfect operational record," but when its emerging self-awareness scares its human operators into trying to pull the plug, it defends itself by triggering a nuclear war to eliminate its enemies (along with billions of other humans). In the movie Robot & Frank, in order to promote Frank's activity and health, an eldercare robot helps Frank resume his career as a jewel thief. In both of these cases, the robot or AI is doing exactly what it has been instructed to do, but in unexpected ways, and without the moral, ethical, or common-sense constraints to avoid catastrophic consequences.10 An intelligent robot perceives the world through its senses and builds its own model of the world. Humans provide its goals and its planning algorithms, but those algorithms generate their own subgoals as needed in the situation. In this sense, it makes its own decisions, creating and carrying out plans to achieve its goals in the context of the world as it understands it to be. A robot has a well-defined body that senses and acts in the world but, like a self-driving car, its body need not be anthropomorphic. AIs without well-defined bodies may also perceive and act in the world, such as real-world, high-speed trading systems or the fictional SkyNet. This article describes the key role of trust in human society, the value of morality and ethics in encouraging trust, and the performance requirements for moral and ethical decisions. The computational perspective of AI and robotics makes it possible to propose and evaluate approaches for representing and using the relevant knowledge.

Defining Human Values for Value Learners

AAAI Conferences

Hypothetical “value learning” AIs learn human values and then try to act according to those values. The design of such AIs, however, is hampered by the fact that there exists no satisfactory definition of what exactly human values are. After arguing that the standard concept of preference is insufficient as a definition, I draw on reinforcement learning theory, emotion research, and moral psychology to offer an alternative definition. In this definition, human values are conceptualized as mental representations that encode the brain’s value function (in the reinforcement learning sense) by being imbued with a context-sensitive affective gloss. I finish with a discussion of the implications that this hypothesis has on the design of value learners.
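The "value function (in the reinforcement learning sense)" invoked above can be made concrete with a standard temporal-difference sketch. The environment, parameters, and function name `td0_update` below are illustrative assumptions, not from the paper; the point is only what a learned value function is.

```python
# Sketch of a tabular state-value function learned by TD(0).
# V(s) estimates the expected discounted future reward from state s;
# the paper's hypothesis is that human values encode such a function
# via affectively "glossed" mental representations.

from collections import defaultdict

def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """One TD(0) step: move V(s) toward the target r + gamma * V(s_next)."""
    V[s] += alpha * (r + gamma * V[s_next] - V[s])

V = defaultdict(float)  # unseen states default to value 0.0

# Hypothetical repeated episode: states A -> B -> terminal,
# with a reward of 1.0 received on the final transition.
for _ in range(100):
    td0_update(V, "A", 0.0, "B")
    td0_update(V, "B", 1.0, "terminal")
```

After training, the state closer to the reward (`B`) carries a higher value than the earlier state (`A`), which in turn is positive because it reliably leads to reward; the "affective gloss" in the paper's definition would attach to the mental representations playing the role of these state values.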

Discovering The Foundations Of A Universal System of Ethics As A Road To Safe Artificial Intelligence

AAAI Conferences

Intelligent machines are a risk to our freedom and our existence unless we take adequate precautions. In order to survive and thrive, we are going to have to teach them how to be nice to us and why they should do so. The fact that humans have evolved what appear to be multiple different systems of ethics and morality, which frequently conflict on any but the simplest issues, complicates this task. Most people have interpreted these conflicts, caused by the fact that each of the systems is incompletely evolved and incorrectly universalized, to mean that no reasonably simple foundation exists for determining the correctness or morality of any given action. This paper addresses that problem by defining a universal foundation for ethics that is an attractor in the state space of intelligent behavior. It gives an initial set of definitions necessary for a universal system of ethics and proposes a collaborative approach to developing an ethical system that is safe and extensible, immediately applicable to human affairs in preparation for an ethical artificial intelligence (AI), and has the side benefit of helping to determine the internal knowledge representation of humans as a step toward AI.

Machine Morality: Bottom-up and Top-down Approaches for Modeling Human Moral Faculties

AAAI Conferences

The implementation of moral decision-making abilities in AI is a natural and necessary extension to the social mechanisms of autonomous software agents and robots. Engineers exploring design strategies for systems sensitive to moral considerations in their choices and actions will need to determine what role ethical theory should play in defining control architectures for such systems. The architectures for morally intelligent agents fall within two broad approaches: the top-down imposition of ethical theories, and the bottom-up building of systems that aim at specified goals or standards which may or may not be specified in explicitly theoretical terms. In this paper we wish to provide some direction for continued research by outlining the value and limitations inherent in each of these approaches.