Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off
Concept Bottleneck Models (or CBMs, [1]) have, since their introduction in 2020, become a significant milestone in explainable AI. These models address the lack of human trust in AI by encouraging a deep neural network to be interpretable by design. To this end, CBMs first learn a mapping between their inputs (e.g., images of cats and dogs) and a set of "concepts" that correspond to high-level units of information commonly used by humans to describe what they see (e.g., "whiskers", "long tail", "black fur", etc…). This mapping function, which we will call the "concept encoder", is learnt by -- you guessed it :) -- a differentiable and highly expressive model such as a deep neural network!

It is through this concept encoder that CBMs then solve downstream tasks of interest (e.g., classifying an image as a dog or a cat) by mapping concepts to output labels. The label predictor used to map concepts to task labels can be your favourite differentiable model, although in practice it tends to be something simple, such as a single fully connected layer.
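To make the two-stage structure concrete, here is a minimal sketch of a CBM's forward pass in NumPy. All weights, layer sizes, and names (`ToyCBM`, `predict_concepts`, `predict_label`) are illustrative assumptions: a real CBM would use a trained deep network (e.g., a CNN) as the concept encoder, whereas here both stages are single random linear layers.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class ToyCBM:
    """Input -> concept predictions -> task label, in two explicit stages.

    Hypothetical toy model with random, untrained weights, purely to
    illustrate the CBM forward pass described in the text.
    """

    def __init__(self, n_features, n_concepts, n_classes, seed=0):
        rng = np.random.default_rng(seed)
        # "Concept encoder": here a single linear layer + sigmoid;
        # in practice a deep network over the raw input (e.g., pixels).
        self.W_enc = rng.normal(size=(n_features, n_concepts))
        self.b_enc = np.zeros(n_concepts)
        # "Label predictor": typically something simple, e.g., one
        # fully connected layer on top of the concepts.
        self.W_lab = rng.normal(size=(n_concepts, n_classes))
        self.b_lab = np.zeros(n_classes)

    def predict_concepts(self, x):
        # Each entry is the predicted probability that a human-interpretable
        # concept (e.g., "whiskers", "long tail") is present in the input.
        return sigmoid(x @ self.W_enc + self.b_enc)

    def predict_label(self, x):
        # The task prediction is made *only* from the concept predictions:
        # this bottleneck is what makes the model interpretable by design.
        concepts = self.predict_concepts(x)
        return softmax(concepts @ self.W_lab + self.b_lab), concepts

model = ToyCBM(n_features=8, n_concepts=4, n_classes=2)
x = np.ones((3, 8))          # a batch of 3 toy inputs
probs, concepts = model.predict_label(x)
```

Because the label depends on the input only through `concepts`, a user can inspect (or even intervene on) the predicted concepts to understand why a particular label was chosen.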
Oct-13-2022, 18:50:22 GMT