Causally Reliable Concept Bottleneck Models
De Felice, Giovanni; Flores, Arianna Casanova; De Santis, Francesco; Santini, Silvia; Schneider, Johannes; Barbiero, Pietro; Termine, Alberto
arXiv.org Artificial Intelligence
Concept-based models are an emerging paradigm in deep learning that constrains the inference process to operate through human-interpretable concepts, facilitating explainability and human interaction. However, these architectures, like popular opaque neural models, fail to account for the true causal mechanisms underlying the target phenomena represented in the data. This hampers their ability to support causal reasoning tasks, limits out-of-distribution generalization, and hinders the implementation of fairness constraints. To overcome these issues, we propose Causally reliable Concept Bottleneck Models (C²BMs), a class of concept-based architectures that enforce reasoning through a bottleneck of concepts structured according to a model of the real-world causal mechanisms. We also introduce a pipeline to automatically learn this structure from observational data and unstructured background knowledge (e.g., scientific literature). Experimental evidence suggests that C²BMs are more interpretable, more causally reliable, and more responsive to interventions than standard opaque and concept-based models, while maintaining their accuracy.
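The idea of a concept bottleneck structured as a causal graph can be illustrated with a toy sketch. This is not the authors' implementation: the graph, the concept names (`smoker`, `tar`, `cancer`), the fixed weights, and the `predict` helper are all hypothetical, chosen only to show how an intervention on one concept propagates to its descendants while leaving its ancestors untouched.

```python
import math
from typing import Dict, List, Optional


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical causal graph over concepts: each node maps to its parents.
# The dict is written in topological order (parents before children).
PARENTS: Dict[str, List[str]] = {
    "smoker": [],
    "tar": ["smoker"],
    "cancer": ["tar"],  # the task variable sits at the end of the graph
}

# Hypothetical fixed weights standing in for learned structural equations.
WEIGHTS: Dict[str, Dict[str, float]] = {
    "tar": {"smoker": 4.0, "bias": -2.0},
    "cancer": {"tar": 3.0, "bias": -1.5},
}


def predict(
    encoder_probs: Dict[str, float],
    interventions: Optional[Dict[str, float]] = None,
) -> Dict[str, float]:
    """Propagate concept values through the graph, honoring interventions."""
    interventions = interventions or {}
    values: Dict[str, float] = {}
    for node, parents in PARENTS.items():
        if node in interventions:
            # do(node = v): ignore the parents and clamp the value.
            values[node] = interventions[node]
        elif not parents:
            # Root concepts are assumed to come from an input encoder.
            values[node] = encoder_probs[node]
        else:
            w = WEIGHTS[node]
            z = w["bias"] + sum(w[p] * values[p] for p in parents)
            values[node] = sigmoid(z)
    return values


baseline = predict({"smoker": 0.9})
intervened = predict({"smoker": 0.9}, interventions={"tar": 0.0})
print(baseline["cancer"], intervened["cancer"])
```

In this sketch, clamping `tar` to 0 lowers the predicted `cancer` value but leaves `smoker` unchanged, which is the behavior one would expect from an intervention on a causal graph.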
Mar-6-2025
- Country:
  - North America > United States
    - Colorado (0.04)
    - California > Santa Clara County
      - Santa Clara (0.04)
  - Europe
    - Liechtenstein (0.04)
    - United Kingdom > England
      - Cambridgeshire > Cambridge (0.04)
    - Germany > Bavaria
      - Upper Bavaria > Munich (0.04)
  - Asia > Japan
    - Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
- Genre:
  - Research Report > New Finding (0.46)
- Industry:
- Technology:
  - Information Technology > Artificial Intelligence
    - Representation & Reasoning > Uncertainty (1.00)
    - Natural Language > Large Language Model (0.96)
    - Vision (0.93)
    - Cognitive Science (0.93)
    - Machine Learning
      - Statistical Learning (1.00)
      - Neural Networks > Deep Learning (1.00)
      - Learning Graphical Models > Directed Networks
        - Bayesian Learning (0.68)