feature attribution method
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- Law (0.46)
- Health & Medicine > Diagnostic Medicine (0.46)
- Information Technology > Security & Privacy (0.46)
- Government > Regional Government (0.46)
OpenXAI: Towards a Transparent Evaluation of Post hoc Model Explanations
While several types of post hoc explanation methods have been proposed in recent literature, there is very little work on systematically benchmarking these methods. Here, we introduce OpenXAI, a comprehensive and extensible open-source framework for evaluating and benchmarking post hoc explanation methods.
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > New Jersey (0.04)
- Europe > Italy > Tuscany > Florence (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Vision (0.70)
- Information Technology > Sensing and Signal Processing > Image Processing (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Stability Guarantees for Feature Attributions with Multiplicative Smoothing
Explanation methods for machine learning models tend not to provide any formal guarantees and may not reflect the underlying decision-making process. In this work, we analyze stability as a property for reliable feature attribution methods. We prove that relaxed variants of stability are guaranteed if the model is sufficiently Lipschitz with respect to the masking of features. We develop a smoothing method called Multiplicative Smoothing (MuS) to achieve such a model. We show that MuS overcomes the theoretical limitations of standard smoothing techniques and can be integrated with any classifier and feature attribution method. We evaluate MuS on vision and language models with various feature attribution methods, such as LIME and SHAP, and demonstrate that MuS endows feature attributions with non-trivial stability guarantees.
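The idea of smoothing a classifier over feature masks can be illustrated with a rough Monte Carlo sketch. It assumes the smoothed model averages a base classifier over inputs whose features are independently kept with probability `lam`; the names `smoothed_classifier`, `toy_classifier`, and `lam` are hypothetical, and the construction in the paper may differ (for example, in how the masks are sampled).

```python
import numpy as np

def smoothed_classifier(h, x, alpha, lam=0.25, n_samples=2000, seed=0):
    """Monte Carlo estimate of E_s[h(x * alpha * s)] with s_i ~ Bernoulli(lam).

    h      : base classifier mapping a feature vector to class scores in [0, 1]
    x      : input feature vector
    alpha  : binary mask selecting the features an attribution keeps
    lam    : keep probability of the random multiplicative mask (assumed i.i.d.)
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    # Each row is one random keep mask s; a feature survives only if both
    # alpha and s keep it.
    keep = rng.random((n_samples, x.size)) < lam
    masked_inputs = x * alpha * keep
    outputs = np.stack([h(row) for row in masked_inputs])
    return outputs.mean(axis=0)

def toy_classifier(z):
    # Hypothetical two-class base classifier used only for illustration.
    score = 1.0 / (1.0 + np.exp(-(z.sum() - 1.0)))
    return np.array([score, 1.0 - score])

x = np.array([0.5, 1.0, 1.5, 2.0])
alpha_a = np.array([1, 1, 0, 0])
alpha_b = np.array([1, 1, 1, 0])  # differs from alpha_a in one feature

f_a = smoothed_classifier(toy_classifier, x, alpha_a)
f_b = smoothed_classifier(toy_classifier, x, alpha_b)
print(f_a, f_b, np.abs(f_a - f_b).max())
```

In this sketch, flipping one entry of the mask can only change the output on samples where the random mask keeps that feature, so the change in the averaged scores is bounded by roughly `lam`. That is the kind of Lipschitz behavior with respect to masking that the abstract refers to.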
The effectiveness of feature attribution methods and its correlation with automatic evaluation scores
Explaining the decisions of an Artificial Intelligence (AI) model is increasingly critical in many real-world, high-stakes applications. Hundreds of papers have proposed new feature attribution methods or have discussed or harnessed these tools in their work. However, despite humans being the target end-users, most attribution methods were only evaluated on proxy automatic-evaluation metrics (Zhang et al. 2018; Zhou et al. 2016; Petsiuk et al. 2018). In this paper, we conduct the first user study to measure attribution map effectiveness in assisting humans in ImageNet classification and Stanford Dogs fine-grained classification, and when an image is natural or adversarial (i.e., contains adversarial perturbations). Overall, feature attribution is surprisingly not more effective than showing humans nearest training-set examples. On a harder task of fine-grained dog categorization, presenting attribution maps to humans does not help, but instead hurts the performance of human-AI teams compared to AI alone. Importantly, we found automatic attribution-map evaluation measures to correlate poorly with the actual human-AI team performance. Our findings encourage the community to rigorously test their methods on the downstream human-in-the-loop applications and to rethink the existing evaluation metrics.
$\mathcal{M}^4$: A Unified XAI Benchmark for Faithfulness Evaluation of Feature Attribution Methods across Metrics, Modalities and Models
While Explainable Artificial Intelligence (XAI) techniques have been widely studied to explain predictions made by deep neural networks, the way to evaluate the faithfulness of explanation results remains challenging, due to the heterogeneity of explanations for various models and the lack of ground-truth explanations. This paper introduces an XAI benchmark named $\mathcal{M}^4$, which allows evaluating various input feature attribution methods using the same set of faithfulness metrics across multiple data modalities (images and texts) and network structures (ResNets, MobileNets, Transformers). A taxonomy for the metrics has been proposed as well. We first categorize commonly used XAI evaluation metrics into three groups based on the ground truth they require. We then implement classic and state-of-the-art feature attribution methods using InterpretDL and conduct extensive experiments to compare methods, gain insights, and provide holistic evaluations as benchmark baselines. Several interesting observations are made that bear on the design of attribution algorithms.
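To make the notion of a faithfulness metric concrete, here is a minimal sketch of a generic deletion-style score: features are removed in order of decreasing attribution and the model output is tracked. This is an illustrative assumption, not necessarily one of the metrics in the benchmark; `deletion_curve`, `deletion_auc`, and the toy linear model are hypothetical names introduced here.

```python
import numpy as np

def deletion_curve(model, x, attributions, baseline=0.0):
    """Model output as features are replaced by `baseline`, most important first."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(-np.asarray(attributions, dtype=float))
    perturbed = x.copy()
    scores = [model(perturbed)]
    for idx in order:
        perturbed[idx] = baseline
        scores.append(model(perturbed))
    return np.array(scores)

def deletion_auc(scores):
    """Trapezoidal area under the curve with the x-axis normalized to [0, 1]."""
    scores = np.asarray(scores, dtype=float)
    return float(((scores[:-1] + scores[1:]) / 2.0).mean())

# Hypothetical linear model whose true feature importances are its weights.
weights = np.array([3.0, 1.0, 0.5, 0.1])

def model(z):
    return float(weights @ z)

x = np.ones(4)
faithful_attr = weights * x   # ranks features by their true contribution
unfaithful_attr = -weights    # ranks features in the opposite order

auc_faithful = deletion_auc(deletion_curve(model, x, faithful_attr))
auc_unfaithful = deletion_auc(deletion_curve(model, x, unfaithful_attr))
print(auc_faithful, auc_unfaithful)
```

Under this metric a lower area suggests a more faithful attribution, since removing the features it ranks highest makes the output drop faster. Metrics of this kind need no ground-truth explanation, which is one reason perturbation-based scores are common in benchmarks.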
Distribution-Based Feature Attribution for Explaining the Predictions of Any Classifier
The proliferation of complex, black-box AI models has intensified the need for techniques that can explain their decisions. Feature attribution methods have become a popular solution for providing post-hoc explanations, yet the field has historically lacked a formal problem definition. This paper addresses this gap by introducing a formal definition for the problem of feature attribution, which stipulates that explanations be supported by an underlying probability distribution represented by the given dataset. Our analysis reveals that many existing model-agnostic methods fail to meet this criterion, while even those that do often possess other limitations. To overcome these challenges, we propose Distributional Feature Attribution eXplanations (DFAX), a novel, model-agnostic method for feature attribution. DFAX is the first feature attribution method to explain classifier predictions directly based on the data distribution. We show through extensive experiments that DFAX is more effective and efficient than state-of-the-art baselines.
- Asia > China > Jiangsu Province > Nanjing (0.04)
- South America > Paraguay > Asunción > Asunción (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)