What do we learn from inverting CLIP models?

Hamid Kazemi, Atoosa Chegini, Jonas Geiping, Soheil Feizi, Tom Goldstein

Mar-4-2024, arXiv.org Artificial Intelligence

We use an inversion-based approach to examine CLIP models. Inverting a CLIP model yields images that are semantically aligned with the specified target prompt. We use these inverted images to study several aspects of CLIP models, such as their ability to blend concepts and the gender biases they encode. Notably, NSFW (Not Safe For Work) images appear during model inversion, even for semantically innocuous prompts such as "a beautiful landscape" and for prompts involving the names of celebrities. Warning: This paper contains sexually explicit images and language, offensive visuals and terminology, discussions of pornography and gender bias, and other content that some readers may find unsettling, distressing, or offensive.
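The abstract does not spell out the optimization procedure, so the following is only a toy sketch of the general idea behind inversion: start from a random input and adjust it by gradient ascent until its embedding aligns with a target embedding. Everything here is an assumption made for illustration. The fixed random linear map `weights` stands in for an image encoder, the random vector `target` stands in for the embedding of a text prompt, and the names `encode`, `invert`, `N_PIXELS`, and `N_EMBED` are hypothetical. No real CLIP model is involved.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def encode(weights, pixels):
    """Toy stand-in for an image encoder: a fixed linear map (assumption)."""
    return [dot(row, pixels) for row in weights]


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def invert(weights, target, pixels, steps=2000, lr=0.2):
    """Gradient ascent on cosine(encode(pixels), target) with respect to pixels."""
    pixels = list(pixels)
    t_norm = math.sqrt(dot(target, target))
    for _ in range(steps):
        z = encode(weights, pixels)
        z_norm = math.sqrt(dot(z, z))
        zt = dot(z, target)
        # d cos(z, t) / dz = t / (|z||t|) - (z.t) z / (|z|^3 |t|)
        grad_z = [
            t / (z_norm * t_norm) - zt * zi / (z_norm ** 3 * t_norm)
            for zi, t in zip(z, target)
        ]
        # Chain rule through the linear encoder: grad_pixels = W^T grad_z
        for j in range(len(pixels)):
            step = sum(weights[i][j] * grad_z[i] for i in range(len(z)))
            pixels[j] += lr * step
    return pixels


rng = random.Random(0)
N_PIXELS, N_EMBED = 16, 4  # hypothetical toy sizes

# Random "encoder", scaled so embeddings have roughly unit-variance entries.
weights = [
    [rng.gauss(0, 1) / math.sqrt(N_PIXELS) for _ in range(N_PIXELS)]
    for _ in range(N_EMBED)
]
target = [rng.gauss(0, 1) for _ in range(N_EMBED)]  # stand-in "prompt" embedding
start = [rng.gauss(0, 1) for _ in range(N_PIXELS)]  # random initial "image"

result = invert(weights, target, start)
initial_similarity = cosine(encode(weights, start), target)
final_similarity = cosine(encode(weights, result), target)
print(f"cosine similarity: {initial_similarity:.3f} -> {final_similarity:.3f}")
```

With a real model, the linear map would be replaced by the pretrained image encoder, the target by the encoded text prompt, and the hand-written gradient by automatic differentiation. The structure of the loop (encode, score alignment, step the input) would stay the same.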

Country:
- North America > United States
  - Maryland (0.04)
- Europe
  - Poland (0.04)
  - Germany > Baden-Württemberg
    - Tübingen Region > Tübingen (0.04)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report (1.00)

Technology:
- Information Technology
  - Sensing and Signal Processing > Image Processing (0.68)
  - Artificial Intelligence
    - Vision (1.00)
    - Natural Language (1.00)
    - Machine Learning > Neural Networks
      - Deep Learning (0.94)
