Visual Explanations of Image-Text Representations via Multi-Modal Information Bottleneck Attribution

Neural Information Processing Systems 

Vision-language pretrained models have seen remarkable success, but their application to safety-critical settings is limited by their lack of interpretability.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found