Hyperbolic Image-Text Representations
Desai, Karan; Nickel, Maximilian; Rajpurohit, Tanmay; Johnson, Justin; Vedantam, Ramakrishna
arXiv.org Artificial Intelligence
Visual and linguistic concepts naturally organize themselves in a hierarchy, where a textual concept "dog" entails all images that contain dogs. Despite being intuitive, current large-scale vision and language models such as CLIP do not explicitly capture such hierarchy. We propose MERU, a contrastive model that yields hyperbolic representations of images and text. Hyperbolic spaces have suitable geometric properties to embed tree-like data, so MERU can better capture the underlying hierarchy in image-text datasets. Our results show that MERU learns a highly interpretable and structured representation space while being competitive with CLIP's performance on standard multi-modal tasks like image classification and image-text retrieval.
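To make the geometric idea concrete, here is a minimal sketch of how a hyperbolic distance could stand in for the cosine similarity that CLIP-style contrastive models use. The abstract does not say which model of hyperbolic space MERU uses, so the Lorentz (hyperboloid) model, the function names `exp_map_origin`, `lorentz_distance` and `similarity`, and the `curv` parameter are all assumptions made for illustration. This is not the authors' implementation.

```python
import math

def exp_map_origin(v, curv=1.0):
    """Lift a Euclidean vector v onto the hyperboloid of curvature -curv.

    Assumption: the Lorentz model, with v treated as a tangent vector at the
    hyperboloid's origin. Returns (space_components, time_component).
    """
    norm = math.sqrt(sum(x * x for x in v))
    sc = math.sqrt(curv)
    if norm == 0.0:
        space = [0.0] * len(v)
    else:
        scale = math.sinh(sc * norm) / (sc * norm)
        space = [scale * x for x in v]
    # The time component is fixed by the constraint <x, x>_L = -1 / curv.
    time = math.sqrt(1.0 / curv + sum(x * x for x in space))
    return space, time

def lorentz_distance(p, q, curv=1.0):
    """Geodesic distance between two hyperboloid points p and q."""
    (p_space, p_time), (q_space, q_time) = p, q
    # Lorentzian inner product: space part minus time part.
    inner = sum(a * b for a, b in zip(p_space, q_space)) - p_time * q_time
    # Clamp to the domain of acosh to guard against rounding error.
    return math.acosh(max(1.0, -curv * inner)) / math.sqrt(curv)

def similarity(image_vec, text_vec, curv=1.0):
    """Hypothetical contrastive score: closer pairs score higher."""
    p = exp_map_origin(image_vec, curv)
    q = exp_map_origin(text_vec, curv)
    return -lorentz_distance(p, q, curv)
```

In this sketch, a generic concept such as "dog" would ideally sit nearer the origin than the specific images it entails, because the volume of hyperbolic space grows exponentially with distance from the origin, which is what makes it suit tree-like data.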
Jun-5-2023
- Country:
  - Africa
    - Kenya > Lamu County
      - Lamu (0.04)
    - South Africa (0.04)
    - Uganda (0.04)
  - Asia
  - Europe
    - Slovenia > Coastal-Karst
      - Municipality of Koper > Koper (0.04)
    - Finland > Uusimaa
      - Helsinki (0.04)
    - Greece > Attica
      - Athens (0.04)
    - Norway (0.04)
    - United Kingdom > England
      - Cumbria (0.04)
      - Isle of Wight (0.04)
      - Kent > Dover (0.04)
    - Denmark (0.04)
    - Poland (0.04)
    - Bulgaria (0.04)
    - Netherlands > North Holland
      - Amsterdam (0.04)
    - Austria > Vienna (0.14)
  - North America
    - Canada
      - Alberta (0.04)
      - Newfoundland and Labrador > Labrador (0.04)
      - Ontario > Toronto (0.04)
    - Mexico > Jalisco
      - Tlaquepaque (0.04)
    - United States
      - Ohio > Greene County
        - Fairborn (0.04)
      - California
        - Alameda County > Oakland (0.04)
        - San Francisco County > San Francisco (0.14)
      - North Dakota > Billings County (0.04)
      - Texas > Galveston County
        - Galveston (0.04)
      - Michigan (0.04)
      - Illinois > Cook County
        - Chicago (0.04)
      - Arizona (0.04)
      - New York
        - Bronx County > New York City (0.04)
        - Kings County > New York City (0.04)
        - New York County > New York City (0.14)
        - Queens County > New York City (0.04)
        - Richmond County > New York City (0.04)
      - Hawaii > Honolulu County
        - Honolulu (0.04)
      - Nevada > Clark County
        - Las Vegas (0.04)
      - Alaska
        - Juneau City and Borough > Juneau (0.04)
        - Prince of Wales-Hyder Census Area > Craig (0.04)
  - Pacific Ocean > North Pacific Ocean
    - San Francisco Bay > Golden Gate (0.04)
- Genre:
  - Research Report > New Finding (0.86)
- Industry:
  - Consumer Products & Services (1.00)
  - Energy (0.67)
  - Leisure & Entertainment (1.00)
  - Media > Photography (0.46)
  - Transportation (0.67)
- Technology:
  - Information Technology
    - Artificial Intelligence
      - Machine Learning
        - Neural Networks > Deep Learning (1.00)
        - Statistical Learning (0.93)
      - Natural Language (1.00)
      - Representation & Reasoning (1.00)
      - Vision (1.00)
    - Communications > Social Media (0.93)
    - Sensing and Signal Processing > Image Processing (1.00)