Decades of research in artificial intelligence (AI) have produced formidable technologies that are providing immense benefit to industry, government, and society. AI systems can now translate across multiple languages, identify objects in images and video, streamline manufacturing processes, and control cars. The deployment of AI systems has not only created a trillion-dollar industry that is projected to quadruple in three years, but has also exposed the need to make AI systems fair, explainable, trustworthy, and secure. Future AI systems will rightfully be expected to reason effectively about the world in which they (and people) operate, handling complex tasks and responsibilities effectively and ethically, engaging in meaningful communication, and improving their awareness through experience. Achieving the full potential of AI technologies poses research challenges that require a radical transformation of the AI research enterprise, facilitated by significant and sustained investment. These are the major recommendations of a recent community effort coordinated by the Computing Community Consortium and the Association for the Advancement of Artificial Intelligence to formulate a Roadmap for AI research and development over the next two decades.
This is an integrative review that address the question, "What makes for a good explanation?" with reference to AI systems. Pertinent literatures are vast. Thus, this review is necessarily selective. That said, most of the key concepts and issues are expressed in this Report. The Report encapsulates the history of computer science efforts to create systems that explain and instruct (intelligent tutoring systems and expert systems). The Report expresses the explainability issues and challenges in modern AI, and presents capsule views of the leading psychological theories of explanation. Certain articles stand out by virtue of their particular relevance to XAI, and their methods, results, and key points are highlighted. It is recommended that AI/XAI researchers be encouraged to include in their research reports fuller details on their empirical or experimental methods, in the fashion of experimental psychology research reports: details on Participants, Instructions, Procedures, Tasks, Dependent Variables (operational definitions of the measures and metrics), Independent Variables (conditions), and Control Conditions.
What if I told a story here, how would that story start?" Thus, the summarization prompt: "My second grader asked me what this passage means: …" When a given prompt isn't working and GPT-3 keeps pivoting into other modes of completion, that may mean that one hasn't constrained it enough by imitating a correct output, and one needs to go further; writing the first few words or sentence of the target output may be necessary.
Natural language understanding (NLU) of text is a fundamental challenge in AI, and it has received significant attention throughout the history of NLP research. This primary goal has been studied under different tasks, such as Question Answering (QA) and Textual Entailment (TE). In this thesis, we investigate the NLU problem through the QA task and focus on the aspects that make it a challenge for the current state-of-the-art technology. This thesis is organized into three main parts: In the first part, we explore multiple formalisms to improve existing machine comprehension systems. We propose a formulation for abductive reasoning in natural language and show its effectiveness, especially in domains with limited training data. Additionally, to help reasoning systems cope with irrelevant or redundant information, we create a supervised approach to learn and detect the essential terms in questions. In the second part, we propose two new challenge datasets. In particular, we create two datasets of natural language questions where (i) the first one requires reasoning over multiple sentences; (ii) the second one requires temporal common sense reasoning. We hope that the two proposed datasets will motivate the field to address more complex problems. In the final part, we present the first formal framework for multi-step reasoning algorithms, in the presence of a few important properties of language use, such as incompleteness, ambiguity, etc. We apply this framework to prove fundamental limitations for reasoning algorithms. These theoretical results provide extra intuition into the existing empirical evidence in the field.
The prevalence of e-learning systems and on-line courses has made educational material widely accessible to students of varying abilities and backgrounds. There is thus a growing need to accommodate for individual differences in e-learning systems. This paper presents an algorithm called EduRank for personalizing educational content to students that combines a collaborative filtering algorithm with voting methods. EduRank constructs a difficulty ranking for each student by aggregating the rankings of similar students using different aspects of their performance on common questions. These aspects include grades, number of retries, and time spent solving questions. It infers a difficulty ranking directly over the questions for each student, rather than ordering them according to the student's predicted score. The EduRank algorithm was tested on two data sets containing thousands of students and a million records. It was able to outperform the state-of-the-art ranking approaches as well as a domain expert. EduRank was used by students in a classroom activity, where a prior model was incorporated to predict the difficulty rankings of students with no prior history in the system. It was shown to lead students to solve more difficult questions than an ordering by a domain expert, without reducing their performance.