Understanding Visual Concepts Across Models
Trabucco, Brandon, Gurinas, Max, Doherty, Kyle, Salakhutdinov, Ruslan
arXiv.org Artificial Intelligence
Large multimodal models such as Stable Diffusion can generate, detect, and classify new visual concepts after fine-tuning just a single word embedding. Do models learn similar words for the same concepts (i.e. <orange-cat> = orange + cat)? We conduct a large-scale analysis on three state-of-the-art models in text-to-image generation, open-set object detection, and zero-shot classification, and find that new word embeddings are model-specific and non-transferable. Across 4,800 new embeddings trained for 40 diverse visual concepts on four standard datasets, we find perturbations within an $\epsilon$-ball to any prior embedding that generate, detect, and classify an arbitrary concept. When these new embeddings are spliced into new models, fine-tuning that targets the original model is lost. We show popular soft prompt-tuning approaches find these perturbative solutions when applied to visual concept learning tasks, and embeddings for visual concepts are not transferable. Code for reproducing our work is available at: https://visual-words.github.io.
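The idea of a perturbation confined to an $\epsilon$-ball around a prior embedding can be illustrated with a toy projected gradient descent. This is a minimal sketch under stated assumptions, not the authors' training code: the L2 norm, the quadratic stand-in loss, and the names `project_to_ball`, `tune_within_ball`, `prior`, and `target` are all hypothetical choices made for illustration.

```python
import math


def project_to_ball(vec, center, eps):
    """Project vec onto the L2 ball of radius eps around center (norm choice is an assumption)."""
    delta = [v - c for v, c in zip(vec, center)]
    norm = math.sqrt(sum(d * d for d in delta))
    if norm <= eps:
        return list(vec)
    scale = eps / norm
    return [c + d * scale for c, d in zip(center, delta)]


def tune_within_ball(prior, target, eps, lr=0.1, steps=200):
    """Projected gradient descent on the toy loss 0.5 * ||emb - target||^2.

    `target` is a hypothetical stand-in for whatever direction a real
    model's loss would pull the embedding toward.
    """
    emb = list(prior)
    for _ in range(steps):
        grad = [e - t for e, t in zip(emb, target)]      # gradient of the toy loss
        emb = [e - lr * g for e, g in zip(emb, grad)]    # plain gradient step
        emb = project_to_ball(emb, prior, eps)           # stay within eps of the prior
    return emb


prior = [1.0, 0.0, 0.0]    # toy "existing word" embedding
target = [0.0, 2.0, 0.0]   # toy optimum, far outside the ball
tuned = tune_within_ball(prior, target, eps=0.5)
distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(tuned, prior)))
```

In this toy setting the tuned vector ends up on the boundary of the ball, as close to `target` as the constraint allows. In a real model the loss would come from the generation, detection, or classification objective rather than a fixed target vector.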
Jun-11-2024