Goto

Collaborating Authors

 foil




Foundations of Symbolic Languages for Model Interpretability

Neural Information Processing Systems

Several queries and scores have recently been proposed to explain individual predictions over ML models. Examples include queries based on "anchors", which are parts of an instance that are sufficient to justify its classification, and "feature-perturbation" scores such as SHAP. Given the need for flexible, reliable, and easy-to-apply interpretability methods for ML models, we foresee the need for developing declarative languages to naturally specify different explainability queries. We do this in a principled way by rooting such a language in a logic called FOIL, which allows for expressing many simple but important explainability queries, and might serve as a core for more expressive interpretability languages. We study the computational complexity of FOIL queries over two classes of ML models often deemed to be easily interpretable: decision trees and more general decision diagrams. Since the number of possible inputs for an ML model is exponential in its dimension, tractability of the FOIL evaluation problem is delicate but can be achieved by either restricting the structure of the models, or the fragment of FOIL being evaluated. We also present a prototype implementation of FOIL wrapped in a high-level declarative language and perform experiments showing that such a language can be used in practice.


Appendix Organization The supplementary material is organized as follows: Section A presents a brief

Neural Information Processing Systems

Performance Data Set which serve to show the usability of our implementation in practice. Section J explains the binarization process for real-valued decision trees and high-level queries. We review the definition of first-order logic (FO) over vocabularies consisting only of relations. If x,y are variables, then x = y is an FO-formula over σ . This proof requires some background in model theory.


Foundations of Symbolic Languages for Model Interpretability Marcelo Arenas 1,4, Daniel Baez

Neural Information Processing Systems

Several queries and scores have been proposed to explain individual predictions made by ML models. Examples include queries based on "anchors", which are parts of an instance that are sufficient to justify its classification, and "feature-perturbation" scores such as SHAP .


Tell Me Who Your Students Are: GPT Can Generate Valid Multiple-Choice Questions When Students' (Mis)Understanding Is Hinted

Shimmei, Machi, Uto, Masaki, Matsubayashi, Yuichiroh, Inui, Kentaro, Mallavarapu, Aditi, Matsuda, Noboru

arXiv.org Artificial Intelligence

The primary goal of this study is to develop and evaluate an innovative prompting technique, AnaQuest, for generating multiple-choice questions (MCQs) using a pre-trained large language model. In AnaQuest, the choice items are sentence-level assertions about complex concepts. The technique integrates formative and summative assessments. In the formative phase, students answer open-ended questions for target concepts in free text. For summative assessment, AnaQuest analyzes these responses to generate both correct and incorrect assertions. To evaluate the validity of the generated MCQs, Item Response Theory (IRT) was applied to compare item characteristics between MCQs generated by AnaQuest, a baseline ChatGPT prompt, and human-crafted items. An empirical study found that expert instructors rated MCQs generated by both AI models to be as valid as those created by human instructors. However, IRT-based analysis revealed that AnaQuest-generated questions - particularly those with incorrect assertions (foils) - more closely resembled human-crafted items in terms of difficulty and discrimination than those produced by ChatGPT.


Sentence Smith: Formally Controllable Text Transformation and its Application to Evaluation of Text Embedding Models

Li, Hongji, Michail, Andrianos, Gubelmann, Reto, Clematide, Simon, Opitz, Juri

arXiv.org Artificial Intelligence

We propose the Sentence Smith framework that enables controlled and specified manipulation of text meaning. It consists of three main steps: 1. Parsing a sentence into a semantic graph, 2. Applying human-designed semantic manipulation rules, and 3. Generating text from the manipulated graph. A final filtering step (4.) ensures the validity of the applied transformation. To demonstrate the utility of Sentence Smith in an application study, we use it to generate hard negative pairs that challenge text embedding models. Since the controllable generation makes it possible to clearly isolate different types of semantic shifts, we can gain deeper insights into the specific strengths and weaknesses of widely used text embedding models, also addressing an issue in current benchmarking where linguistic phenomena remain opaque. Human validation confirms that the generations produced by Sentence Smith are highly accurate.


Foundations of Symbolic Languages for Model Interpretability

Neural Information Processing Systems

Several queries and scores have recently been proposed to explain individual predictions over ML models. Examples include queries based on "anchors", which are parts of an instance that are sufficient to justify its classification, and "feature-perturbation" scores such as SHAP. Given the need for flexible, reliable, and easy-to-apply interpretability methods for ML models, we foresee the need for developing declarative languages to naturally specify different explainability queries. We do this in a principled way by rooting such a language in a logic called FOIL, which allows for expressing many simple but important explainability queries, and might serve as a core for more expressive interpretability languages. We study the computational complexity of FOIL queries over two classes of ML models often deemed to be easily interpretable: decision trees and more general decision diagrams.


CE-MRS: Contrastive Explanations for Multi-Robot Systems

Schneider, Ethan, Wu, Daniel, Das, Devleena, Chernova, Sonia

arXiv.org Artificial Intelligence

As the complexity of multi-robot systems grows to incorporate a greater number of robots, more complex tasks, and longer time horizons, the solutions to such problems often become too complex to be fully intelligible to human users. In this work, we introduce an approach for generating natural language explanations that justify the validity of the system's solution to the user, or else aid the user in correcting any errors that led to a suboptimal system solution. Toward this goal, we first contribute a generalizable formalism of contrastive explanations for multi-robot systems, and then introduce a holistic approach to generating contrastive explanations for multi-robot scenarios that selectively incorporates data from multi-robot task allocation, scheduling, and motion-planning to explain system behavior. Through user studies with human operators we demonstrate that our integrated contrastive explanation approach leads to significant improvements in user ability to identify and solve system errors, leading to significant improvements in overall multi-robot team performance.


Contrastive Explanations That Anticipate Human Misconceptions Can Improve Human Decision-Making Skills

Buçinca, Zana, Swaroop, Siddharth, Paluch, Amanda E., Doshi-Velez, Finale, Gajos, Krzysztof Z.

arXiv.org Artificial Intelligence

People's decision-making abilities often fail to improve or may even erode when they rely on AI for decision-support, even when the AI provides informative explanations. We argue this is partly because people intuitively seek contrastive explanations, which clarify the difference between the AI's decision and their own reasoning, while most AI systems offer "unilateral" explanations that justify the AI's decision but do not account for users' thinking. To align human-AI knowledge on decision tasks, we introduce a framework for generating human-centered contrastive explanations that explain the difference between AI's choice and a predicted, likely human choice about the same task. Results from a large-scale experiment (N = 628) demonstrate that contrastive explanations significantly enhance users' independent decision-making skills compared to unilateral explanations, without sacrificing decision accuracy. Amid rising deskilling concerns, our research demonstrates that incorporating human reasoning into AI design can foster human skill development.