marcotcr/lime

#artificialintelligence

This project is about explaining what machine learning classifiers (or models) are doing. At the moment, we support explaining individual predictions for text classifiers, with a package called lime (short for local interpretable model-agnostic explanations). Lime is based on the work presented in this paper. Our plan is to add more packages that help users understand and interact meaningfully with machine learning. Lime is able to explain any black-box text classifier with two or more classes.
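
As a concrete illustration, here is a minimal sketch of explaining a single text prediction with lime's LimeTextExplainer. The scikit-learn pipeline, the 20 newsgroups categories, and the parameter choices are illustrative assumptions rather than part of the project description above.

```python
# Minimal sketch: explain one text prediction with LIME (assumes scikit-learn
# and that the 20 newsgroups corpus can be downloaded).
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from lime.lime_text import LimeTextExplainer

train = fetch_20newsgroups(subset="train", categories=["sci.med", "sci.space"])

# Any black-box classifier works as long as it maps raw strings to class probabilities.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(train.data, train.target)

explainer = LimeTextExplainer(class_names=train.target_names)
exp = explainer.explain_instance(
    train.data[0],        # the document to explain
    clf.predict_proba,    # black-box probability function over raw text
    num_features=6,       # number of words to include in the explanation
)
print(exp.as_list())      # [(word, weight), ...] for the explained class
```

The weights indicate how strongly each word pushes the local surrogate model towards or away from the predicted class.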


Probabilistic Sufficient Explanations

arXiv.org Artificial Intelligence

Understanding the behavior of learned classifiers is an important task, and various black-box explanations, logical reasoning approaches, and model-specific methods have been proposed. In this paper, we introduce probabilistic sufficient explanations, which formulate explaining an instance of classification as choosing the "simplest" subset of features such that only observing those features is "sufficient" to explain the classification. That is, sufficient to give us strong probabilistic guarantees that the model will behave similarly when all features are observed under the data distribution. In addition, we leverage tractable probabilistic reasoning tools such as probabilistic circuits and expected predictions to design a scalable algorithm for finding the desired explanations while keeping the guarantees intact. Our experiments demonstrate the effectiveness of our algorithm in finding sufficient explanations, and showcase its advantages compared to Anchors and logical explanations.
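
One way to make the sufficiency guarantee concrete is the probabilistic reading below; the subset S, the threshold 1 - δ and the distribution notation are an illustrative paraphrase rather than the paper's exact formulation.

```latex
% Hedged paraphrase of the guarantee: x is classified as c = f(x), and a feature
% subset S is "sufficient" if observing only x_S already pins down the prediction
% with high probability under the data distribution \mathcal{D}:
\Pr_{X \sim \mathcal{D}}\!\left[\, f(X) = c \;\middle|\; X_S = x_S \,\right] \;\ge\; 1 - \delta .
% The explanation is then a "simplest" (e.g., smallest) subset S meeting this bound,
% which the paper searches for using probabilistic circuits and expected predictions.
```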


Understanding how to explain predictions with "explanation vectors"

#artificialintelligence

In a recent post I introduced three existing approaches to explaining individual predictions of any machine learning model. After the posts focused on LIME and Shapley values, it is now the turn of explanation vectors, a method presented by David Baehrens, Timon Schroeter and Stefan Harmeling in 2010. As we have seen in those posts, explaining a decision of a black-box model means understanding which input features made the model give its prediction for the observation being explained. Intuitively, a feature has a lot of influence on the model's decision if small variations in its value cause large variations in the model's output, while a feature has little influence on the prediction if big changes in that variable barely affect the model's output. Since the model's output for a given class (e.g., its predicted probability) is a scalar function of the input, its gradient points in the direction of the greatest rate of increase of that output, so it can be used as a measure of each feature's influence.
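
As a sketch of that idea, the snippet below estimates such a gradient by central finite differences on a black-box predict_proba function. The function name, the step size eps, and the smoothness assumption are illustrative; a differentiable model would normally supply the gradient directly.

```python
# Sketch: approximate the gradient of P(class | x) with respect to each input
# feature by central finite differences (assumes a reasonably smooth, real-valued
# feature space and a scikit-learn-style predict_proba).
import numpy as np

def explanation_vector(predict_proba, x, class_idx, eps=1e-4):
    """Approximate dP(class_idx | x)/dx_j for every feature j."""
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for j in range(x.size):
        x_plus, x_minus = x.copy(), x.copy()
        x_plus[j] += eps
        x_minus[j] -= eps
        p_plus = predict_proba(x_plus.reshape(1, -1))[0, class_idx]
        p_minus = predict_proba(x_minus.reshape(1, -1))[0, class_idx]
        grad[j] = (p_plus - p_minus) / (2 * eps)
    return grad  # large |grad[j]| means feature j has strong local influence
```

Features with near-zero entries barely move the predicted probability locally, matching the intuition described above.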


marcotcr/lime

#artificialintelligence

This project is about explaining what machine learning classifiers (or models) are doing. At the moment, we support explaining individual predictions for text classifiers or classifiers that act on tables (numpy arrays of numerical or categorical data), with a package called lime (short for local interpretable model-agnostic explanations). Lime is based on the work presented in this paper. Our plan is to add more packages that help users understand and interact meaningfully with machine learning. Lime is able to explain any black-box text classifier with two or more classes.
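
For the tabular case, a minimal sketch with lime's LimeTabularExplainer might look as follows; the random forest, the iris dataset, and the parameter choices are illustrative assumptions.

```python
# Minimal sketch: explain one tabular prediction with LIME (assumes scikit-learn).
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

iris = load_iris()
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(iris.data, iris.target)

explainer = LimeTabularExplainer(
    iris.data,                        # training data, used to sample perturbations
    feature_names=iris.feature_names,
    class_names=iris.target_names,
    discretize_continuous=True,       # bin continuous features for readable rules
)
exp = explainer.explain_instance(iris.data[0], clf.predict_proba, num_features=4)
print(exp.as_list())                  # [(feature condition, weight), ...]
```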


Influence-Driven Explanations for Bayesian Network Classifiers

arXiv.org Artificial Intelligence

One of the most pressing issues in AI in recent years has been the need to address the lack of explainability of many of its models. We focus on explanations for discrete Bayesian network classifiers (BCs), targeting greater transparency of their inner workings by including intermediate variables in explanations, rather than just the input and output variables as is standard practice. The proposed influence-driven explanations (IDXs) for BCs are systematically generated using the causal relationships between variables within the BC, called influences, which are then categorised by logical requirements, called relation properties, according to their behaviour. These relation properties both provide guarantees beyond heuristic explanation methods and allow the information underpinning an explanation to be tailored to a particular context's and user's requirements, e.g., IDXs may be dialectical or counterfactual. We demonstrate IDXs' capability to explain various forms of BCs, e.g., naive or multi-label, binary or categorical, and also integrate recent approaches to explanations for BCs from the literature. We evaluate IDXs with theoretical and empirical analyses, demonstrating their considerable advantages when compared with existing explanation methods.
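
The formal IDX construction is given in the paper; purely as a loose illustration of reading feature-to-class influences dialectically in a naive Bayesian classifier, one might label an observed feature as supporting the predicted class if conditioning on it raises the class posterior and as attacking it otherwise. The GaussianNB model, the iris data, and the support/attack rule below are my assumptions, not the paper's definitions.

```python
# Loose illustration (not the paper's IDX construction): in a naive Bayesian
# classifier every feature influences the class directly, and we can read an
# observed feature value as "supporting" or "attacking" the predicted class by
# comparing the posterior with and without that feature's likelihood term.
import numpy as np
from scipy.stats import norm
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

iris = load_iris()
nb = GaussianNB().fit(iris.data, iris.target)
x = iris.data[0]
pred = int(nb.predict([x])[0])

def posterior(observed):
    """Posterior over classes given only the features in `observed`; naive Bayes
    lets us marginalise out the others by dropping their likelihood terms."""
    logp = np.log(nb.class_prior_).copy()
    for c in range(len(nb.classes_)):
        for j in observed:
            # nb.var_ is called nb.sigma_ in scikit-learn < 1.0
            logp[c] += norm.logpdf(x[j], nb.theta_[c, j], np.sqrt(nb.var_[c, j]))
    p = np.exp(logp - logp.max())
    return p / p.sum()

full = posterior(range(x.size))[pred]
for j, name in enumerate(iris.feature_names):
    without = posterior([k for k in range(x.size) if k != j])[pred]
    role = "supports" if full > without else "attacks"
    print(f"{name} = {x[j]:.1f} {role} class {iris.target_names[pred]}")
```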