LANCE: Stress-testing Visual Models by Generating Language-guided Counterfactual Images

Prabhu, Viraj, Yenamandra, Sriram, Chattopadhyay, Prithvijit, Hoffman, Judy

Oct-27-2023–arXiv.org Artificial Intelligence

We propose an automated algorithm to stress-test a trained visual model by generating language-guided counterfactual test images (LANCE). Our method leverages recent progress in large language modeling and text-based image editing to augment an IID test set with a suite of diverse, realistic, and challenging test images without altering model weights. We benchmark the performance of a diverse set of pre-trained models on our generated data and observe significant and consistent performance drops. We further analyze model sensitivity across different types of edits, and demonstrate the method's applicability to surfacing previously unknown class-level model biases in ImageNet. Code is available at https://github.com/virajprabhu/lance.
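
The evaluation loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function `stress_test`, the `StressTestResult` class, and the four callables (`caption`, `perturb_caption`, `edit_image`, `predict`) are hypothetical names introduced here, and the sketch assumes each edit leaves the ground-truth label unchanged. The toy stand-ins at the bottom treat an "image" as a plain string so the example runs without any model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

Image = object  # hypothetical placeholder: any object the callables understand


@dataclass
class StressTestResult:
    original_accuracy: float
    counterfactual_accuracy: float

    @property
    def drop(self) -> float:
        """Accuracy lost when moving from original to edited images."""
        return self.original_accuracy - self.counterfactual_accuracy


def stress_test(
    test_set: Iterable[Tuple[Image, str]],
    caption: Callable[[Image], str],               # image -> caption
    perturb_caption: Callable[[str], List[str]],   # caption -> edited captions
    edit_image: Callable[[Image, str, str], Image],  # (image, source, target) -> edited image
    predict: Callable[[Image], str],               # frozen model under test
) -> StressTestResult:
    """Compare accuracy on original images and on their edited counterparts.

    Assumption: every edit preserves the ground-truth label, so the original
    label is reused for the edited image. The model is only queried, never
    updated.
    """
    orig_correct = orig_total = cf_correct = cf_total = 0
    for image, label in test_set:
        orig_total += 1
        orig_correct += int(predict(image) == label)
        source = caption(image)
        for target in perturb_caption(source):
            edited = edit_image(image, source, target)
            cf_total += 1
            cf_correct += int(predict(edited) == label)
    if orig_total == 0 or cf_total == 0:
        raise ValueError("need at least one test image and one edit")
    return StressTestResult(orig_correct / orig_total, cf_correct / cf_total)


# Toy stand-ins: an "image" is just its own caption string.
toy_test_set = [("dog on grass", "dog"), ("cat on sofa", "cat")]


def toy_caption(image):
    return image


def toy_perturb(text):
    # Swap the background for "snow"; one edit per caption.
    return [text.replace("grass", "snow").replace("sofa", "snow")]


def toy_edit(image, source, target):
    return target


def toy_predict(image):
    # A deliberately biased "model" that keys on the background, not the object.
    return "dog" if "grass" in image else "cat"


result = stress_test(toy_test_set, toy_caption, toy_perturb, toy_edit, toy_predict)
print(result.original_accuracy, result.counterfactual_accuracy, result.drop)
```

In this toy setup the biased classifier is perfect on the original pairs but misclassifies the dog once the grass is edited away, which is the kind of background dependence such a stress test is meant to expose.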

caption, howler monkey, prediction

Country:
- North America > United States
  - New Mexico > Bernalillo County > Albuquerque (0.04)
- Europe
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
  - Switzerland > Zürich
    - Zürich (0.14)
- Asia
  - Middle East > Israel
    - Tel Aviv District > Tel Aviv (0.04)
  - Japan > Honshū
    - Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre:
- Research Report (0.64)

Industry:
- Leisure & Entertainment > Sports (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Vision (1.00)
  - Natural Language (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.93)
