Embedding Geometries of Contrastive Language-Image Pre-Training

Sep-19-2024–arXiv.org Artificial Intelligence

Since the publication of CLIP, using the InfoNCE loss for contrastive pre-training has become a widely popular way to bridge two or more modalities. Despite this wide adoption, CLIP's original design choices of L2 normalization and cosine similarity logits have rarely been revisited. We have systematically experimented with alternative geometries and softmax logits for language-image pre-training and found that variants with an intuitive Euclidean geometry, Euclidean CLIP (EuCLIP), match or exceed the performance of CLIP and support hierarchical relationships at least as well as a more complicated hyperbolic alternative.
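
To make the contrast in the abstract concrete, here is a minimal pure-Python sketch of a symmetric InfoNCE loss with two interchangeable logit functions. The cosine variant follows CLIP's L2 normalization plus dot product. The Euclidean variant uses negative squared distance on unnormalized embeddings, which is an assumption made for illustration: the abstract does not state the exact logit EuCLIP uses. The function names and the `scale` parameter are hypothetical.

```python
import math


def l2_normalize(v):
    """Project a vector onto the unit sphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_logit(u, v, scale=1.0):
    """CLIP-style logit: scaled cosine similarity of L2-normalized embeddings."""
    u, v = l2_normalize(u), l2_normalize(v)
    return scale * sum(a * b for a, b in zip(u, v))


def euclidean_logit(u, v, scale=1.0):
    """Assumed Euclidean logit: negative scaled squared distance, no normalization."""
    return -scale * sum((a - b) ** 2 for a, b in zip(u, v))


def _cross_entropy(rows):
    """Mean softmax cross-entropy where row k's correct class is column k."""
    total = 0.0
    for k, row in enumerate(rows):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[k]
    return total / len(rows)


def info_nce(images, texts, logit_fn, scale=1.0):
    """Symmetric InfoNCE: average of image-to-text and text-to-image losses.

    images[k] and texts[k] are assumed to be a matching pair.
    """
    logits = [[logit_fn(i, t, scale) for t in texts] for i in images]
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (_cross_entropy(logits) + _cross_entropy(transposed))


if __name__ == "__main__":
    images = [[1.0, 0.0], [0.0, 1.0]]
    texts = [[2.0, 0.0], [0.0, 2.0]]
    print("cosine   :", info_nce(images, texts, cosine_logit))
    print("euclidean:", info_nce(images, texts, euclidean_logit))
```

One difference the sketch makes visible: the cosine logit ignores embedding norms, while the Euclidean logit depends on them, so vector length can carry information in the Euclidean case.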

Genre:
- Research Report > New Finding (0.46)

Industry:
- Media > Photography (0.47)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Text Processing (0.49)
  - Machine Learning
    - Statistical Learning (0.49)
    - Neural Networks > Deep Learning (0.47)
