Towards learning to explain with concept bottleneck models: mitigating information leakage

Joshua Lockhart, Nicolas Marchesotti, Daniele Magazzeni, Manuela Veloso

Nov-7-2022, arXiv.org Artificial Intelligence

Concept bottleneck models perform classification by first predicting which of a list of human-provided concepts are true of a datapoint. A downstream model then uses these predicted concept labels to predict the target label, so the predicted concepts act as a rationale for the target prediction. Trust issues emerge in this paradigm when soft concept labels are used: it has previously been observed that extra information about the data distribution leaks into the concept predictions. In this work we show how Monte-Carlo Dropout can be used to obtain soft concept predictions that do not contain leaked information.
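
The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the linear concept predictor, the toy weights, the dropout rate, and in particular the choice to threshold each stochastic pass to a hard 0/1 label and report the firing frequency as the soft score are all assumptions made for illustration. All function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def concept_logits(x, weights, drop_rate, rng):
    """One stochastic forward pass of a hypothetical linear concept predictor.

    Dropout is applied to the input features and is kept active at inference.
    """
    keep = 1.0 - drop_rate
    # Inverted dropout: zero each feature with probability drop_rate,
    # rescale the surviving features by 1 / keep.
    masked = [xi / keep if rng.random() < keep else 0.0 for xi in x]
    return [sum(w * m for w, m in zip(row, masked)) for row in weights]


def mc_dropout_concepts(x, weights, drop_rate=0.5, passes=200, seed=0):
    """Soft concept scores as the fraction of passes in which each concept fires."""
    rng = random.Random(seed)
    counts = [0] * len(weights)
    for _ in range(passes):
        logits = concept_logits(x, weights, drop_rate, rng)
        for k, z in enumerate(logits):
            # Assumption: each pass is thresholded to a hard 0/1 concept label.
            counts[k] += 1 if sigmoid(z) >= 0.5 else 0
    return [c / passes for c in counts]


def predict_label(concepts, label_weights, bias):
    """Downstream model: logistic regression that sees only the concept scores."""
    return sigmoid(sum(w * c for w, c in zip(label_weights, concepts)) + bias)


# Toy, made-up numbers purely for illustration.
x = [1.0, -2.0, 0.5]
W = [[0.8, -0.1, 0.3], [-0.5, 0.9, 0.2]]
soft_concepts = mc_dropout_concepts(x, W)
target_prob = predict_label(soft_concepts, [1.5, -1.0], 0.0)
```

In this sketch the downstream model receives only the per-concept frequencies, so each soft score reflects how often a hard concept decision was made under dropout noise rather than a raw sigmoid output. Whether this matches the paper's exact procedure should be checked against the full text.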

artificial intelligence, concept label, machine learning
