Commonsense Reasoning
Planning, Executing, and Evaluating the Winograd Schema Challenge
Morgenstern, Leora (Leidos) | Davis, Ernest (New York University) | Ortiz, Charles L. (Nuance)
The Winograd Schema Challenge (WSC) was proposed by Hector Levesque in 2011 as an alternative to the Turing Test. Chief among its features is a simple question format that can span many commonsense knowledge domains. Questions are chosen so that they do not require specialized knowledge or training and are easy for humans to answer. This article details our plans to run the WSC and evaluate results.
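To make the question format concrete, the following is a minimal sketch of how one schema might be represented and scored. The sentence is Levesque's widely cited trophy and suitcase example, which does not appear in the abstract above; the class name `WinogradSchema`, its fields, and the `score` helper are hypothetical and are not taken from the authors' evaluation plan.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class WinogradSchema:
    """One schema: a sentence with a slot for a special word (hypothetical layout)."""
    template: str                 # sentence containing a {word} slot
    pronoun: str                  # the ambiguous pronoun to resolve
    candidates: Tuple[str, str]   # the two possible referents
    answers: Dict[str, str]       # special word -> correct referent

    def question(self, word: str) -> str:
        """Fill the slot with one of the special words."""
        return self.template.format(word=word)


# Swapping the special word flips the correct referent of "it".
TROPHY = WinogradSchema(
    template="The trophy doesn't fit in the brown suitcase because it is too {word}.",
    pronoun="it",
    candidates=("the trophy", "the suitcase"),
    answers={"big": "the trophy", "small": "the suitcase"},
)

Resolver = Callable[[WinogradSchema, str], str]


def score(schemas: List[WinogradSchema], resolver: Resolver) -> float:
    """Fraction of (schema, special word) questions the resolver answers correctly."""
    total = 0
    correct = 0
    for schema in schemas:
        for word, referent in schema.answers.items():
            total += 1
            if resolver(schema, word) == referent:
                correct += 1
    return correct / total if total else 0.0


def first_candidate(schema: WinogradSchema, word: str) -> str:
    """Trivial baseline: always pick the first candidate, ignoring the sentence."""
    return schema.candidates[0]


if __name__ == "__main__":
    print(TROPHY.question("big"))
    print(score([TROPHY], first_candidate))
```

Because the two variants of a schema have opposite answers, a resolver that ignores the special word, such as `first_candidate`, cannot do better than chance on a balanced set.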
Planning, Executing, and Evaluating the Winograd Schema Challenge
Morgenstern, Leora (Leidos) | Davis, Ernest (New York University) | Ortiz, Charles L. (Nuance)
The Turing test turns out to be highly susceptible to systems that few people would wish to call intelligent. The Loebner Prize Competition (Christian 2011) is in particular associated with the development of chatterbots that are best viewed as successors to ELIZA (Weizenbaum 1966), the program that fooled people into thinking that they were talking to a human psychotherapist by cleverly turning a person's statements into questions of the sort a therapist would ask. The knowledge and inference that characterize conversations of substance -- for example, discussing alternate metaphors in sonnets of Shakespeare -- and which Turing presented as examples of the sorts of conversation that an intelligent system should be able to produce, are absent in these chatterbots. The focus is merely on engaging in surface-level conversation that can fool some humans who do not delve too deeply into a conversation, for at least a few minutes, into thinking that they are speaking to another person. The test taker, however, who is given a commonsense knowledge.
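The technique attributed to ELIZA above, turning a person's statement into a therapist-style question, can be illustrated with a very small pattern-matching sketch. This is not Weizenbaum's script: the rules, the `REFLECTIONS` table, and the function names are hypothetical, chosen only to show how little knowledge such surface-level rewriting needs.

```python
import re

# Hypothetical first-person to second-person swaps applied to the echoed fragment.
REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are"}

# Hypothetical rules: a pattern over the whole statement and a question template.
RULES = [
    (re.compile(r"i feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"i am (.+)", re.IGNORECASE), "How long have you been {0}?"),
]


def reflect(fragment: str) -> str:
    """Swap pronouns word by word so the fragment can be echoed back."""
    return " ".join(REFLECTIONS.get(word, word) for word in fragment.lower().split())


def respond(statement: str) -> str:
    """Turn a statement into a question using the first matching rule."""
    cleaned = statement.strip().rstrip(".!")
    for pattern, template in RULES:
        match = pattern.fullmatch(cleaned)
        if match:
            return template.format(reflect(match.group(1)))
    # No rule matched: fall back to a content-free prompt.
    return "Please tell me more."


if __name__ == "__main__":
    print(respond("I feel ignored by my boss."))
    print(respond("The weather is nice."))
```

Nothing in the sketch models what a boss is or why being ignored matters, which is the kind of missing knowledge and inference the passage describes.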
The Social-Emotional Turing Challenge
Jarrold, William (Nuance Communications) | Yeh, Peter Z. (Nuance Communications)
Social-emotional intelligence is an essential part of being a competent human and is thus required for human-level AI. When considering alternatives to the Turing Test, it is therefore a capacity that is important to test. We characterize this capacity as affective theory of mind and describe some unique challenges associated with its interpretive or generative nature. Mindful of these challenges, we describe a five-step method along with preliminary investigations into its application. We also describe certain characteristics of the approach, such as its incremental nature and countermeasures that make it difficult to game or cheat.
How to Write Science Questions that Are Easy for People and Hard for Computers
Davis, Ernest (New York University)
As a challenge problem for AI systems, I propose the use of hand-constructed multiple-choice tests, with problems that are easy for people but hard for computers. Specifically, I discuss techniques for constructing such problems at the level of a fourth-grade child and at the level of a high-school student. For the fourth-grade level, I argue that questions that require the understanding of time, of impossible or pointless scenarios, of causality, of the human body, or of sets of objects, and questions that require combining facts or require simple inductive arguments of indeterminate length, can be chosen to be easy for people and are likely to be hard for AI programs in the current state of the art. For the high-school level, I argue that questions that relate the formal science to the realia of laboratory experiments or of real-world observations are likely to be easy for people and hard for AI programs. I argue that these are more useful benchmarks than existing standardized tests such as the SATs or Regents tests. Since the questions in standardized tests are designed to be hard for people, they often leave many aspects of what is hard for computers but easy for people untested.
My Computer Is an Honor Student -- but How Intelligent Is It? Standardized Tests as a Measure of AI
Clark, Peter (Allen Institute for Artificial Intelligence) | Etzioni, Oren (Allen Institute for Artificial Intelligence)
Given the well-known limitations of the Turing Test, there is a need for objective tests to both focus attention on, and measure progress towards, the goals of AI. In this paper we argue that machine performance on standardized tests should be a key component of any new measure of AI, because attaining a high level of performance requires solving significant AI problems involving language understanding and world modeling - critical skills for any machine that lays claim to intelligence. In addition, standardized tests have all the basic requirements of a practical test: they are accessible, easily comprehensible, clearly measurable, and offer a graduated progression from simple tasks to those requiring deep understanding of the world. Here we propose this task as a challenge problem for the community, summarize our state-of-the-art results on math and science tests, and provide supporting datasets.
Automatic Extraction of Events-Based Conditional Commonsense Knowledge
Sharma, Arpit (Arizona State University) | Baral, Chitta (Arizona State University)
Reasoning with commonsense knowledge plays an important role in various NLU tasks. Often the commonsense knowledge needs to be extracted separately. In this paper we present our work on automatically extracting a certain type of commonsense knowledge. The knowledge resembles the kind that humans have about events and the entities that participate in those events. One example of such knowledge is "IF A bullying B causes T rescued Z THEN (possibly) Z = B". We call this knowledge event-based conditional commonsense. Our approach involves semantic parsing of natural language sentences using the Knowledge Parser (K-Parser) and extracting the knowledge, if found. We extracted about 19,000 instances of such knowledge from the Open American National Corpus.
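As an illustration of how a rule of the form "IF A bullying B causes T rescued Z THEN (possibly) Z = B" could be stored and applied, here is a minimal sketch. It does not use or reproduce K-Parser; the `Event` and `ConditionalRule` classes, the `linked_role` field, and the example sentence entities are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    """A simplified event: an agent performs a verb on a patient (hypothetical)."""
    agent: str
    verb: str
    patient: str


@dataclass(frozen=True)
class ConditionalRule:
    """IF <cause_verb> event causes <effect_verb> event THEN (possibly) the
    effect's patient is the entity filling `linked_role` in the cause."""
    cause_verb: str
    effect_verb: str
    linked_role: str  # "agent" or "patient" of the cause event


def likely_referent(rule: ConditionalRule, cause: Event, effect: Event) -> Optional[str]:
    """Return the cause-event entity that the rule suggests for the effect's patient.

    The result is only a plausible guess, mirroring the "(possibly)" in the rule.
    """
    if cause.verb != rule.cause_verb or effect.verb != rule.effect_verb:
        return None  # the rule does not apply to this pair of events
    return getattr(cause, rule.linked_role)


# The rule quoted in the abstract: Z (who was rescued) is possibly B (who was bullied).
BULLY_RESCUE = ConditionalRule(cause_verb="bully", effect_verb="rescue", linked_role="patient")

if __name__ == "__main__":
    cause = Event(agent="Tom", verb="bully", patient="James")
    effect = Event(agent="Sam", verb="rescue", patient="him")  # "him" is unresolved
    print(likely_referent(BULLY_RESCUE, cause, effect))
```

Keeping the rule as data rather than code means that a large set of extracted instances could be loaded and matched in the same way, under the assumption that events have already been parsed into this simplified form.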
An AI with 30 Years' Worth of Knowledge Finally Goes to Work
Having spent the past 31 years memorizing an astonishing collection of general knowledge, the artificial-intelligence engine created by Doug Lenat is finally ready to go to work. Lenat's creation is Cyc, a knowledge base of semantic information designed to give computers some understanding of how things work in the real world. Cyc has been given many thousands of facts, including lots of information that you wouldn't find in an encyclopedia because it seems self-evident. It knows, for example, that Sir Isaac Newton is a famous historical figure who is no longer alive. But more important, Cyc also understands that if you let go of an apple it will fall to the ground; that an apple is not bigger than a person; and that a person cannot throw an apple into space.
The Problem of AI Consciousness
Some things in life cannot be offset by a mere net gain in intelligence. The last few years have seen the widespread recognition that sophisticated AI is under development. Bill Gates, Stephen Hawking, and others warn of the rise of "superintelligent" machines: AIs that outthink the smartest humans in every domain, including common sense reasoning and social skills. Superintelligence could destroy us, they caution. In contrast, Ray Kurzweil, a Google director of engineering, depicts a technological utopia bringing about the end of disease, poverty and resource scarcity.
Reports on the 2015 AAAI Spring Symposium Series
Agarwal, Nitin (University of Arkansas at Little Rock) | Andrist, Sean (University of Wisconsin-Madison) | Bohus, Dan (Microsoft Research) | Fang, Fei (University of Southern California) | Fenstermacher, Laurie (Wright-Patterson Air Force Base) | Kagal, Lalana (Massachusetts Institute of Technology) | Kido, Takashi (Rikengenesis) | Kiekintveld, Christopher (University of Texas at El Paso) | Lawless, W. F. (Paine College) | Liu, Huan (Arizona State University) | McCallum, Andrew (University of Massachusetts) | Purohit, Hemant (Wright State University) | Seneviratne, Oshani (Massachusetts Institute of Technology) | Takadama, Keiki (University of Electro-Communications) | Taylor, Gavin (US Naval Academy)
The AAAI 2015 Spring Symposium Series was held Monday through Wednesday, March 23-25, at Stanford University near Palo Alto, California. The titles of the eight symposia were Ambient Intelligence for Health and Cognitive Enhancement; Applied Computational Game Theory; Foundations of Autonomy and Its (Cyber) Threats: From Individuals to Interdependence; Knowledge Representation and Reasoning: Integrating Symbolic and Neural Approaches; Logical Formalizations of Commonsense Reasoning; Socio-Technical Behavior Mining: From Data to Decisions; Structured Data for Humanitarian Technologies: Perfect Fit or Overkill?; and Turn-Taking and Coordination in Human-Machine Interaction. The highlights of each symposium are presented in this report.