Explanation & Argumentation
CounterNet: End-to-End Training of Prediction Aware Counterfactual Explanations
Guo, Hangzhi, Nguyen, Thanh Hong, Yadav, Amulya
This work presents CounterNet, a novel end-to-end learning framework which integrates Machine Learning (ML) model training and the generation of corresponding counterfactual (CF) explanations into a single end-to-end pipeline. Counterfactual explanations offer a contrastive case, i.e., they attempt to find the smallest modification to the feature values of an instance that changes the prediction of the ML model on that instance to a predefined output. Prior techniques for generating CF explanations suffer from two major limitations: (i) all of them are post-hoc methods designed for use with proprietary ML models -- as a result, their procedure for generating CF explanations is uninformed by the training of the ML model, which leads to misalignment between model predictions and explanations; and (ii) most of them rely on solving separate time-intensive optimization problems to find CF explanations for each input data point (which negatively impacts their runtime). This work makes a novel departure from the prevalent post-hoc paradigm (of generating CF explanations) by presenting CounterNet, an end-to-end learning framework which integrates predictive model training and the generation of counterfactual (CF) explanations into a single pipeline. Unlike post-hoc methods, CounterNet enables the optimization of the CF explanation generation only once together with the predictive model. We adopt a block-wise coordinate descent procedure which helps in effectively training CounterNet's network. Our extensive experiments on multiple real-world datasets show that CounterNet generates high-quality predictions, and consistently achieves 100% CF validity and low proximity scores (thereby achieving a well-balanced cost-invalidity trade-off) for any new input instance, and runs 3X faster than existing state-of-the-art baselines.
Evaluation of Popular XAI Applied to Clinical Prediction Models: Can They be Trusted?
Brankovic, Aida, Cook, David, Rahman, Jessica, Huang, Wenjie, Khanna, Sankalp
The absence of transparency and explainability hinders the clinical adoption of Machine learning (ML) algorithms. Although various methods of explainable artificial intelligence (XAI) have been suggested, there is a lack of literature that delves into their practicality and assesses them based on criteria that could foster trust in clinical environments. To address this gap this study evaluates two popular XAI methods used for explaining predictive models in the healthcare context in terms of whether they (i) generate domain-appropriate representation, i.e. coherent with respect to the application task, (ii) impact clinical workflow and (iii) are consistent. To that end, explanations generated at the cohort and patient levels were analysed. The paper reports the first benchmarking of the XAI methods applied to risk prediction models obtained by evaluating the concordance between generated explanations and the trigger of a future clinical deterioration episode recorded by the data collection system. We carried out an analysis using two Electronic Medical Records (EMR) datasets sourced from Australian major hospitals. The findings underscore the limitations of state-of-the-art XAI methods in the clinical context and their potential benefits. We discuss these limitations and contribute to the theoretical development of trustworthy XAI solutions where clinical decision support guides the choice of intervention by suggesting the pattern or drivers for clinical deterioration in the future.
COMET: X86 Cost Model Explanation Framework
Chaudhary, Isha, Renda, Alex, Mendis, Charith, Singh, Gagandeep
ML-based program cost models have been shown to yield fairly accurate program cost predictions. They can replace heavily-engineered analytical program cost models in mainstream compilers, but their black-box nature discourages their adoption. In this work, we propose the first framework, COMET, for generating faithful, generalizable, and intuitive explanations for x86 cost models. COMET brings interpretability specifically to ML-based cost models, such as Ithemal. We generate and compare COMET's explanations for Ithemal against COMET's explanations for a hand-crafted, accurate analytical model, uiCA. Our empirical findings show an inverse correlation between the error in the cost prediction of a cost model and the prominence of semantically-richer features in COMET's explanations for the cost model for a given x86 basic block.
Detection of Sensor-To-Sensor Variations using Explainable AI
Seifi, Sarah, Schober, Sebastian A., Carbonelli, Cecilia, Servadei, Lorenzo, Wille, Robert
With the growing concern for air quality and its impact on human health, interest in environmental gas monitoring has increased. However, chemi-resistive gas sensing devices are plagued by issues of sensor reproducibility during manufacturing. This study proposes a novel approach for detecting sensor-to-sensor variations in sensing devices using the explainable AI (XAI) method of SHapley Additive exPlanations (SHAP). This is achieved by identifying sensors that contribute the most to environmental gas concentration estimation via machine learning, and measuring the similarity of feature rankings between sensors to flag deviations or outliers. The methodology is tested using artificial and realistic Ozone concentration profiles to train a Gated Recurrent Unit (GRU) model. Two applications were explored in the study: the detection of wrong behaviors of sensors in the train dataset, and the detection of deviations in the test dataset. By training the GRU with the pruned train dataset, we could reduce computational costs while improving the model performance. Overall, the results show that our approach improves the understanding of sensor behavior, successfully detects sensor deviations down to 5-10% from the normal behavior, and leads to more efficient model preparation and calibration. Our method provides a novel solution for identifying deviating sensors, linking inconsistencies in hardware to sensor-to-sensor variations in the manufacturing process on an AI model-level.
Cross-Domain Toxic Spans Detection
Schouten, Stefan F., Barbarestani, Baran, Tufa, Wondimagegnhue, Vossen, Piek, Markov, Ilia
Given the dynamic nature of toxic language use, automated methods for detecting toxic spans are likely to encounter distributional shift. To explore this phenomenon, we evaluate three approaches for detecting toxic spans under cross-domain conditions: lexicon-based, rationale extraction, and fine-tuned language models. Our findings indicate that a simple method using off-the-shelf lexicons performs best in the cross-domain setup. The cross-domain error analysis suggests that (1) rationale extraction methods are prone to false negatives, while (2) language models, despite performing best for the in-domain case, recall fewer explicitly toxic words than lexicons and are prone to certain types of false positives. Our code is publicly available at: https://github.com/
Iterative Partial Fulfillment of Counterfactual Explanations: Benefits and Risks
Counterfactual (CF) explanations, also known as contrastive explanations and algorithmic recourses, are popular for explaining machine learning models in high-stakes domains. For a subject that receives a negative model prediction (e.g., mortgage application denial), the CF explanations are similar instances but with positive predictions, which informs the subject of ways to improve. While their various properties have been studied, such as validity and stability, we contribute a novel one: their behaviors under iterative partial fulfillment (IPF). Specifically, upon receiving a CF explanation, the subject may only partially fulfill it before requesting a new prediction with a new explanation, and repeat until the prediction is positive. Such partial fulfillment could be due to the subject's limited capability (e.g., can only pay down two out of four credit card accounts at this moment) or an attempt to take the chance (e.g., betting that a monthly salary increase of \$800 is enough even though \$1,000 is recommended). Does such iterative partial fulfillment increase or decrease the total cost of improvement incurred by the subject? We mathematically formalize IPF and demonstrate, both theoretically and empirically, that different CF algorithms exhibit vastly different behaviors under IPF. We discuss implications of our observations, advocate for this factor to be carefully considered in the development and study of CF algorithms, and give several directions for future work.
Towards Interpretability in Audio and Visual Affective Machine Learning: A Review
Johnson, David S., Hakobyan, Olya, Drimalla, Hanna
Machine learning is frequently used in affective computing, but presents challenges due the opacity of state-of-the-art machine learning methods. Because of the impact affective machine learning systems may have on an individual's life, it is important that models be made transparent to detect and mitigate biased decision making. In this regard, affective machine learning could benefit from the recent advancements in explainable artificial intelligence (XAI) research. We perform a structured literature review to examine the use of interpretability in the context of affective machine learning. We focus on studies using audio, visual, or audiovisual data for model training and identified 29 research articles. Our findings show an emergence of the use of interpretability methods in the last five years. However, their use is currently limited regarding the range of methods used, the depth of evaluations, and the consideration of use-cases. We outline the main gaps in the research and provide recommendations for researchers that aim to implement interpretable methods for affective machine learning.
Understanding the Role of Human Intuition on Reliance in Human-AI Decision-Making with Explanations
Chen, Valerie, Liao, Q. Vera, Vaughan, Jennifer Wortman, Bansal, Gagan
AI explanations are often mentioned as a way to improve human-AI decision-making, but empirical studies have not found consistent evidence of explanations' effectiveness and, on the contrary, suggest that they can increase overreliance when the AI system is wrong. While many factors may affect reliance on AI support, one important factor is how decision-makers reconcile their own intuition -- beliefs or heuristics, based on prior knowledge, experience, or pattern recognition, used to make judgments -- with the information provided by the AI system to determine when to override AI predictions. We conduct a think-aloud, mixed-methods study with two explanation types (feature- and example-based) for two prediction tasks to explore how decision-makers' intuition affects their use of AI predictions and explanations, and ultimately their choice of when to rely on AI. Our results identify three types of intuition involved in reasoning about AI predictions and explanations: intuition about the task outcome, features, and AI limitations. Building on these, we summarize three observed pathways for decision-makers to apply their own intuition and override AI predictions. We use these pathways to explain why (1) the feature-based explanations we used did not improve participants' decision outcomes and increased their overreliance on AI, and (2) the example-based explanations we used improved decision-makers' performance over feature-based explanations and helped achieve complementary human-AI performance. Overall, our work identifies directions for further development of AI decision-support systems and explanation methods that help decision-makers effectively apply their intuition to achieve appropriate reliance on AI.
A Theoretical Framework for AI Models Explainability with Application in Biomedicine
Rizzo, Matteo, Veneri, Alberto, Albarelli, Andrea, Lucchese, Claudio, Nobile, Marco, Conati, Cristina
EXplainable Artificial Intelligence (XAI) is a vibrant research topic in the artificial intelligence community, with growing interest across methods and domains. Much has been written about the subject, yet XAI still lacks shared terminology and a framework capable of providing structural soundness to explanations. In our work, we address these issues by proposing a novel definition of explanation that is a synthesis of what can be found in the literature. We recognize that explanations are not atomic but the combination of evidence stemming from the model and its input-output mapping, and the human interpretation of this evidence. Furthermore, we fit explanations into the properties of faithfulness (i.e., the explanation being a true description of the model's inner workings and decision-making process) and plausibility (i.e., how much the explanation looks convincing to the user). Using our proposed theoretical framework simplifies how these properties are operationalized and it provides new insight into common explanation methods that we analyze as case studies.
iPDP: On Partial Dependence Plots in Dynamic Modeling Scenarios
Muschalik, Maximilian, Fumagalli, Fabian, Jagtani, Rohit, Hammer, Barbara, Hüllermeier, Eyke
Post-hoc explanation techniques such as the well-established partial dependence plot (PDP), which investigates feature dependencies, are used in explainable artificial intelligence (XAI) to understand black-box machine learning models. While many real-world applications require dynamic models that constantly adapt over time and react to changes in the underlying distribution, XAI, so far, has primarily considered static learning environments, where models are trained in a batch mode and remain unchanged. We thus propose a novel model-agnostic XAI framework called incremental PDP (iPDP) that extends on the PDP to extract time-dependent feature effects in non-stationary learning environments. We formally analyze iPDP and show that it approximates a time-dependent variant of the PDP that properly reacts to real and virtual concept drift. The time-sensitivity of iPDP is controlled by a single smoothing parameter, which directly corresponds to the variance and the approximation error of iPDP in a static learning environment. We illustrate the efficacy of iPDP by showcasing an example application for drift detection and conducting multiple experiments on real-world and synthetic data sets and streams.