Case-Based Reasoning
Interpretable Machine Learning: Fundamental Principles and 10 Grand Challenges
Interpretability in machine learning (ML) is crucial for high stakes decisions and troubleshooting. In this work, we provide fundamental principles for interpretable ML, and dispel common misunderstandings that dilute the importance of this crucial topic. We also identify 10 technical challenge areas in interpretable machine learning and provide history and background on each problem. Some of these problems are classically important, and some are recent problems that have arisen in the last few years. These problems are: (1) Optimizing sparse logical models such as decision trees; (2) Optimization of scoring systems; (3) Placing constraints into generalized additive models to encourage sparsity and better interpretability; (4) Modern case-based reasoning, including neural networks and matching for causal inference; (5) Complete supervised disentanglement of neural networks; (6) Complete or even partial unsupervised disentanglement of neural networks; (7) Dimensionality reduction for data visualization; (8) Machine learning models that can incorporate physics and other generative or causal constraints; (9) Characterization of the "Rashomon set" of good models; and (10) Interpretable reinforcement learning. This survey is suitable as a starting point for statisticians and computer scientists interested in working in interpretable machine learning.
Under-bagging Nearest Neighbors for Imbalanced Classification
Hang, Hanyuan, Cai, Yuchao, Yang, Hanfang, Lin, Zhouchen
In this paper, we propose an ensemble learning algorithm called under-bagging $k$-nearest neighbors (under-bagging $k$-NN) for imbalanced classification problems. On the theoretical side, by developing a new learning theory analysis, we show that with properly chosen parameters, i.e., the number of nearest neighbors $k$, the expected sub-sample size $s$, and the bagging rounds $B$, optimal convergence rates for under-bagging $k$-NN can be achieved under mild assumptions w.r.t. the arithmetic mean (AM) of recalls. Moreover, we show that with a relatively small $B$, the expected sub-sample size $s$ can be much smaller than the number of training data $n$ at each bagging round, and the number of nearest neighbors $k$ can be reduced simultaneously, especially when the data are highly imbalanced, which leads to substantially lower time complexity and roughly the same space complexity. On the practical side, we conduct numerical experiments to verify the theoretical results on the benefits of the under-bagging technique by the promising AM performance and efficiency of our proposed algorithm.
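The idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simplified sampling scheme (each class is under-sampled without replacement to the size of the smallest class in every round), brute-force Euclidean neighbor search, and hypothetical function names (`knn_vote`, `under_bagging_knn`, `am_recall`). The paper's exact sampling scheme and parameter choices may differ.

```python
import random
from collections import Counter, defaultdict


def knn_vote(train, query, k):
    """Return the fraction of each class among the k nearest training points."""
    nearest = sorted(
        train,
        key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], query)),
    )[:k]
    counts = Counter(label for _, label in nearest)
    return {label: c / len(nearest) for label, c in counts.items()}


def under_bagging_knn(data, query, k=3, rounds=10, seed=0):
    """Predict a label by averaging k-NN votes over balanced sub-samples.

    Assumption: every class is under-sampled to the minority class size
    in each round; this is a simplification, not the paper's scheme.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in data:
        by_class[y].append((x, y))
    m = min(len(pts) for pts in by_class.values())

    scores = defaultdict(float)
    for _ in range(rounds):
        # Build one balanced sub-sample and collect its k-NN class fractions.
        sample = []
        for pts in by_class.values():
            sample.extend(rng.sample(pts, m))
        for label, frac in knn_vote(sample, query, k).items():
            scores[label] += frac / rounds
    return max(scores, key=scores.get)


def am_recall(y_true, y_pred):
    """Arithmetic mean of the per-class recalls."""
    recalls = []
    for c in set(y_true):
        idx = [i for i, y in enumerate(y_true) if y == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Toy data: 20 majority points near the origin, 3 minority points near (5, 5).
majority = [((i * 0.1, 0.0), 0) for i in range(20)]
minority = [((5.0, 5.0), 1), ((5.1, 5.0), 1), ((5.0, 5.1), 1)]
toy_data = majority + minority
```

On this toy data, `under_bagging_knn(toy_data, (5.0, 5.0))` returns the minority label `1` and `under_bagging_knn(toy_data, (0.0, 0.0))` returns `0`. The `am_recall` helper shows why the metric suits imbalanced data: a classifier that always predicts the majority class on `[0, 0, 0, 1]` scores only 0.5.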
Bridging Case-Based Reasoning, DL and XAI at the First Virtual ICCBR Conference (ICCBR2020)
Watson, Ian, Weber, Rosina O., Leake, David
Case-based reasoning is reasoning from experience, solving new problems and interpreting new situations by retrieving and adapting prior cases. The Twenty-Eighth International Conference on Case-Based Reasoning (ICCBR2020) was held from June 8-12, 2020, with program chairs Ian Watson and Rosina Weber. The conference was originally scheduled for Salamanca, Spain, a World Heritage site, under the auspices of local chair Juan Manuel Corchado and the University of Salamanca. Its theme, "CBR Across Bridges", reflected the goal of bringing together researchers and practitioners with relevant work across various AI areas. Before the conference, the pandemic struck, with tragic effects. The conference chairs resolved to continue with a safe alternative: the first virtual ICCBR. With researchers unable to travel, the virtual conference bridged not only AI areas but also geographic ones: 141 conference attendees participated from 23 countries.
ProtoMIL: Multiple Instance Learning with Prototypical Parts for Fine-Grained Interpretability
Rymarczyk, Dawid, Kaczyńska, Aneta, Kraus, Jarosław, Pardyl, Adam, Zieliński, Bartosz
Multiple Instance Learning (MIL) is gaining popularity in many real-life machine learning applications due to its weakly supervised nature. However, the corresponding effort on explaining MIL lags behind, and it is usually limited to presenting the instances of a bag that are crucial for a particular prediction. In this paper, we fill this gap by introducing ProtoMIL, a novel self-explainable MIL method inspired by the case-based reasoning process that operates on visual prototypes. By incorporating prototypical features into object descriptions, ProtoMIL combines model accuracy with fine-grained interpretability to an unprecedented degree, which we demonstrate with experiments on five recognized MIL datasets.
Longitudinal Distance: Towards Accountable Instance Attribution
Weber, Rosina O., Goel, Prateek, Amiri, Shideh, Simpson, Gideon
Previous research in interpretable machine learning (IML) and explainable artificial intelligence (XAI) can be broadly categorized as either focusing on seeking interpretability in the agent's model (i.e., IML) or focusing on the context of the user in addition to the model (i.e., XAI). The former can be categorized as feature or instance attribution. Example- or sample-based methods, such as those using or inspired by case-based reasoning (CBR), rely on various approaches to select instances, and the selected instances are not necessarily the ones responsible for an agent's decision. Furthermore, existing approaches have focused on interpretability and explainability but fall short when it comes to accountability. Inspired by case-based reasoning principles, this paper introduces a pseudo-metric we call Longitudinal distance and its use to attribute instances to a neural network agent's decision, which can potentially be used to build accountable CBR agents.
An Introduction to AI Story Generation
Automated story generation is the use of an intelligent system to produce a fictional story from a minimal set of inputs. This is a problem that has long been explored by AI researchers, since it strikes at some fundamental research questions in artificial intelligence. To tell a story, an intelligent system has to have a lot of knowledge, both about how to tell a story and about how the world works. These concepts need to be grounded to be able to tell coherent stories. Story generation is therefore an excellent way to know if an intelligent system truly understands something. To understand a concept, one must be able to put that concept into practice -- telling a story in which a concept is used correctly is one way of doing that. For example, if an AI system tells a story about going to a restaurant, as simple as that sounds, we discover very quickly what the system doesn't understand when it messes up basic details. Besides understanding concepts, storytelling also requires an understanding of the listener or reader, known as a theory of mind -- a model of the listener to reason about what needs to be said or what can be left out and still convey a comprehensible story. In addition to these fundamental AI research problems, automated story generation is also worth studying for the applications it may enable. The remainder of this article presents a primer on the field that I think my students need to know to get started on research on automated story generation, and that anyone interested in the topic may find informative. A caveat: since I have been actively researching automated story generation for nearly two decades, this primer will be somewhat biased toward work from my research group and collaborators. We might distinguish between automated story generation and automated plot generation.
Analogical Learning in Tactical Decision Games
Hinrichs, Tom, Dunham, Greg, Forbus, Ken
Tactical Decision Games (TDGs) are military conflict scenarios presented both textually and graphically on a map. These scenarios provide a challenging domain for machine learning because they are open-ended, highly structured, and typically contain many details of varying relevance. We have developed a problem-solving component of an interactive companion system that proposes military tasks to solve TDG scenarios using a combination of analogical retrieval, mapping, and constraint propagation. We use this problem-solving component to explore analogical learning. In this paper, we describe the problems encountered in learning for this domain, and the methods we have developed to address these, such as partition constraints on analogical mapping correspondences and the use of incremental remapping to improve robustness. We present the results of learning experiments that show improvement in performance through the simple accumulation of examples, despite a weak domain theory.
The application of artificial intelligence in software engineering: a review challenging conventional wisdom
Batarseh, Feras A., Mohod, Rasika, Kumar, Abhinav, Bui, Justin
The field of artificial intelligence (AI) is witnessing a recent upsurge in research, tools development, and deployment of applications. Multiple software companies are shifting their focus to developing intelligent systems, and many others are deploying AI paradigms to their existing processes. In parallel, the academic research community is injecting AI paradigms to provide solutions to traditional engineering problems. Similarly, AI has proven useful to software engineering (SE). When one observes the SE phases (requirements, design, development, testing, release, and maintenance), it becomes clear that multiple AI paradigms (such as neural networks, machine learning, knowledge-based systems, natural language processing) could be applied to improve the process and eliminate many of the major challenges that the SE field has been facing. This survey chapter is a review of the most commonplace methods of AI applied to SE. The review covers methods published between 1975 and 2017: 46 major AI-driven methods are found for the requirements phase, 19 for design, 15 for development, 68 for testing, and 15 for release and maintenance. Furthermore, the purpose of this chapter is threefold. First, to answer the following questions: is there sufficient intelligence in the SE lifecycle? What does applying AI to SE entail? Second, to measure, formulize, and evaluate the overlap of SE phases and AI disciplines. Lastly, to pose serious questions that challenge the current conventional wisdom (i.e., status quo) of the state-of-the-art, craft a call for action, and redefine the path forward.
Task and Situation Structures for Service Agent Planning
Yang, Hao, Eftekhar, Tavan, Esselink, Chad, Ding, Yan, Zhang, Shiqi
Everyday tasks are characterized by their varieties and variations, and frequently are not clearly specified to service agents. This paper presents a comprehensive approach to enable a service agent to deal with everyday tasks in open, uncontrolled environments. We introduce a generic structure for representing tasks, and another structure for representing situations. Based on the two newly introduced structures, we present a methodology of situation handling that avoids hard-coding domain rules while improving the scalability of real-world task planning systems.
Applying the Case Difference Heuristic to Learn Adaptations from Deep Network Features
Ye, Xiaomeng, Zhao, Ziwei, Leake, David, Wang, Xizi, Crandall, David
The case difference heuristic (CDH) approach is a knowledge-light method for learning case adaptation knowledge from the case base of a case-based reasoning system. Given a pair of cases, the CDH approach attributes the difference in their solutions to the difference in the problems they solve, and generates adaptation rules to adjust solutions accordingly when a retrieved case and a new query have similar problem differences. As an alternative to learning adaptation rules, several researchers have applied neural networks to learn to predict solution differences from problem differences. Previous work on such approaches has assumed that the feature set describing problems is predefined. This paper investigates a two-phase process combining deep learning for feature extraction and neural-network-based adaptation learning from the extracted features. Its performance is demonstrated in a regression task on image data: predicting age given the image of a face. Results show that the combined process can successfully learn adaptation knowledge applicable to nonsymbolic differences in cases. The CBR system achieves slightly lower performance overall than a baseline deep network regressor, but better performance than the baseline on novel queries.
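The retrieve-and-adapt loop behind the case difference heuristic can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's system: problems are small numeric tuples rather than deep network features, solutions are scalars, and a nearest-neighbor lookup over problem differences stands in for the neural network that predicts solution differences. The names `learn_adaptations` and `solve` are hypothetical.

```python
from itertools import permutations


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def learn_adaptations(case_base):
    """Pair up cases and record (problem difference, solution difference).

    Each ordered pair of cases yields one adaptation example, following
    the CDH idea that solution differences stem from problem differences.
    """
    rules = []
    for (p1, s1), (p2, s2) in permutations(case_base, 2):
        problem_diff = tuple(a - b for a, b in zip(p1, p2))
        rules.append((problem_diff, s1 - s2))
    return rules


def solve(case_base, rules, query):
    """Retrieve the nearest case, then adjust its solution.

    Assumption: the adaptation applied is the stored solution difference
    whose problem difference is closest to (query - retrieved problem).
    """
    retrieved_problem, retrieved_solution = min(
        case_base, key=lambda case: sq_dist(case[0], query)
    )
    query_diff = tuple(a - b for a, b in zip(query, retrieved_problem))
    _, solution_diff = min(rules, key=lambda rule: sq_dist(rule[0], query_diff))
    return retrieved_solution + solution_diff


# Toy case base where the solution is twice the single problem feature.
cases = [((1.0,), 2.0), ((2.0,), 4.0), ((4.0,), 8.0)]
adaptations = learn_adaptations(cases)
estimate = solve(cases, adaptations, (3.0,))
```

For the query `(3.0,)`, no stored case matches exactly, yet the adapted answer `estimate` is 6.0 because the case pair (2.0, 1.0) supplies a matching problem difference of 1.0 with a solution difference of 2.0. Replacing the nearest-neighbor lookup with a learned regressor over differences would bring the sketch closer to the network-based variants the abstract mentions.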