An intelligent system must be capable of performing automated reasoning and responding to a changing environment (for example, changing knowledge). To exhibit such intelligent behavior, a machine needs to understand its environment as well as be able to interact with it to achieve certain goals. To act rationally, a machine must be able to obtain information and understand it. Knowledge Representation (KR) is an important step in automated reasoning, where knowledge about the world is represented in a way that a machine can understand and process. The representation must also be able to accommodate changes in the world (i.e., new or updated knowledge).
Acquiring commonsense knowledge and reasoning over it is recognized as an important frontier in achieving general Artificial Intelligence (AI). Recent research in the Natural Language Processing (NLP) community has demonstrated significant progress in this problem setting. Despite this progress, which is mainly on multiple-choice question answering tasks in limited settings, there is still a lack of understanding (especially at scale) of the nature of commonsense knowledge itself. In this paper, we propose and conduct a systematic study to enable a deeper understanding of commonsense knowledge through an empirical and structural analysis of the ConceptNet knowledge base. ConceptNet is a freely available knowledge base containing millions of commonsense assertions presented in natural language.
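As a minimal sketch of the kind of structural analysis the abstract describes, commonsense assertions can be modeled as (concept, relation, concept) triples in ConceptNet's style and their relation types tallied. The assertions below are toy examples invented for illustration, not actual ConceptNet data:

```python
from collections import Counter

# Toy assertions in ConceptNet's triple style (illustrative, not real data).
assertions = [
    ("cake", "UsedFor", "celebration"),
    ("knife", "UsedFor", "cutting"),
    ("dog", "IsA", "animal"),
    ("cat", "IsA", "animal"),
    ("oven", "AtLocation", "kitchen"),
]

# A simple structural question: how often does each relation type occur?
relation_counts = Counter(rel for _, rel, _ in assertions)
print(relation_counts)
```

At ConceptNet's real scale (millions of assertions), the same tallying reveals which relation types dominate the knowledge base and which are sparse.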
As a challenge problem for AI systems, I propose the use of hand-constructed multiple-choice tests, with problems that are easy for people but hard for computers. Specifically, I discuss techniques for constructing such problems at the level of a fourth-grade child and at the level of a high school student. For the fourth-grade-level questions, I argue that questions that require the understanding of time, of impossible or pointless scenarios, of causality, of the human body, or of sets of objects, and questions that require combining facts or require simple inductive arguments of indeterminate length can be chosen to be easy for people, and are likely to be hard for AI programs, in the current state of the art. For the high school level, I argue that questions that relate the formal science to the realia of laboratory experiments or of real-world observations are likely to be easy for people and hard for AI programs. I argue that these are more useful benchmarks than existing standardized tests such as the SATs or New York Regents tests.
To apply eyeshadow without a brush, should I use a cotton swab or a toothpick? Questions requiring this kind of physical commonsense pose a challenge to today's natural language understanding systems. While recent pretrained models (such as BERT) have made progress on question answering over more abstract domains - such as news articles and encyclopedia entries, where text is plentiful - in more physical domains, text is inherently limited due to reporting bias. Can AI systems learn to reliably answer physical commonsense questions without experiencing the physical world? In this paper, we introduce the task of physical commonsense reasoning and a corresponding benchmark dataset, Physical Interaction: Question Answering, or PIQA. Though humans find the dataset easy (95% accuracy), large pretrained models struggle (77%). We provide an analysis of the dimensions of knowledge that existing models lack, which offers significant opportunities for future research.
A full book, available for free in PDF form. From the preface: "A major problem in artificial intelligence is to endow computers with commonsense knowledge of the world and with the ability to use that knowledge sensibly. A large body of research has studied this problem through careful analysis of typical examples of reasoning in a variety of commonsense domains. The immediate aim of this research is to develop a rich language for expressing commonsense knowledge, and inference techniques for carrying out commonsense reasoning. This book provides an introduction and a survey of this body of research. It is, to the best of my knowledge, the first book to attempt this." The book is designed to be used as a textbook for a one-semester graduate course on knowledge representation. Published by Morgan Kaufmann.