EMC$^2$: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence
Chung-Yiu Yau, Hoi-To Wai, Parameswaran Raman, Soumajyoti Sarkar, Mingyi Hong
arXiv.org Artificial Intelligence
Contrastive representation learning has been instrumental in self-supervised learning for large-scale pretraining of foundation models (Radford et al., 2021; Cherti et al., 2023) as well as in the fine-tuning stage on downstream tasks (Xiong et al., 2020; Lindgren et al., 2021). It encodes real-world data into low-dimensional feature vectors that abstract the important attributes of the data and generalize well outside of the training distribution. More recently, contrastive learning with multi-modal data has helped embed different data modalities into the same feature space (Li et al., 2023), as in studies of vision-language models (Radford et al., 2021; Alayrac et al., 2022; Cherti et al., 2023) and document understanding (Xu et al., 2020; Lee et al., 2023). Contrastive learning uses pairwise comparisons of representations in the training objective, with the goal of learning representations in which positive pairs are drawn closer while negative pairs are pushed apart in the representation space. It is well known that generating a large dataset of pairwise samples, such as image-text pairs with the same semantics, costs far less than manual labeling; e.g., the WebImageText dataset used for training CLIP originates from Wikipedia articles (Radford et al., 2021).
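As a rough illustration of this pairwise objective (not the EMC$^2$ MCMC negative sampler proposed in the paper), the sketch below implements a standard InfoNCE-style contrastive loss in PyTorch with explicitly sampled negatives; the function name, tensor shapes, and temperature value are assumptions made for this example.

```python
# Minimal sketch of an InfoNCE-style contrastive loss with sampled negatives.
# This is an illustrative assumption, not the paper's EMC^2 estimator.
import torch
import torch.nn.functional as F

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """anchor: (d,), positive: (d,), negatives: (K, d) raw embeddings."""
    # Normalize so that inner products are cosine similarities.
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)

    # Similarity of the positive pair and of each negative pair.
    pos_logit = (anchor * positive).sum(-1, keepdim=True) / temperature  # (1,)
    neg_logits = negatives @ anchor / temperature                        # (K,)

    # Cross-entropy with the positive at index 0: pulls the positive pair
    # together and pushes the K negatives apart in representation space.
    logits = torch.cat([pos_logit, neg_logits]).unsqueeze(0)             # (1, K+1)
    target = torch.zeros(1, dtype=torch.long)
    return F.cross_entropy(logits, target)

# Example: one anchor, its positive, and K = 8 sampled negatives of dimension 64.
loss = info_nce_loss(torch.randn(64), torch.randn(64), torch.randn(8, 64))
```

The quality and cost of choosing the K negatives is exactly the bottleneck that motivates the paper's MCMC-based negative sampling scheme.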
Apr-16-2024