Understanding the (Extra-)Ordinary: Validating Deep Model Decisions with Prototypical Concept-based Explanations
Dreyer, Maximilian; Achtibat, Reduan; Samek, Wojciech; Lapuschkin, Sebastian
arXiv.org Artificial Intelligence
Ensuring both transparency and safety is critical when deploying Deep Neural Networks (DNNs) in high-risk applications, such as medicine. The field of explainable AI (XAI) has proposed various methods to comprehend the decision-making processes of opaque DNNs. However, only a few XAI methods are suitable for ensuring safety in practice, as they rely heavily on repeated, labor-intensive, and possibly biased human assessment. In this work, we present a novel post-hoc concept-based XAI framework that conveys not only instance-wise (local) but also class-wise (global) decision-making strategies via prototypes. What sets our approach apart is the combination of local and global strategies, enabling a clearer understanding of the (dis-)similarities in model decisions compared to the expected (prototypical) concept use, ultimately reducing the dependence on long-term human assessment. Quantifying the deviation from prototypical behavior not only allows predictions to be associated with specific model sub-strategies but also allows outlier behavior to be detected. As such, our approach constitutes an intuitive and explainable tool for model validation. We demonstrate the effectiveness of our approach in identifying out-of-distribution samples, spurious model behavior, and data quality issues across three datasets (ImageNet, CUB-200, and CIFAR-10) utilizing VGG, ResNet, and EfficientNet architectures. Code is available at https://github.com/maxdreyer/pcx.
Nov-28-2023
- Country:
- Europe (0.67)
- North America > United States (0.28)
- Genre:
- Research Report (1.00)
- Industry:
- Government (0.93)
- Transportation (1.00)
- Technology:
- Information Technology
- Artificial Intelligence
- Machine Learning
- Neural Networks > Deep Learning (0.88)
- Performance Analysis > Accuracy (0.67)
- Statistical Learning (1.00)
- Natural Language (1.00)
- Vision (1.00)
- Data Science > Data Mining (1.00)