Convergence of End-to-End Training in Deep Unsupervised Contrastive Learning

Feb-20-2020–arXiv.org Machine Learning

Unsupervised representation learning has achieved enormous success in practical applications, especially in natural language processing, with examples such as word2vec (Mikolov et al., 2013) and BERT (Devlin et al., 2019) and its variants as unsupervised pretrained language models. Among unsupervised learning approaches, contrastive learning has gained increasing attention in the deep learning community. Notably, as shown by He et al. (2019), unsupervised contrastively pretrained models can outperform their supervised counterparts in many downstream vision tasks, suggesting that computer vision, which was previously dominated by supervised pretraining, can also benefit from unsupervised pretraining. Beyond these conventional approaches, unsupervised contrastive learning has also been employed in a variety of novel applications such as layer-wise representation learning (Löwe et al., 2019) and representation learning of the actual world (Kipf et al., 2019). Together, these studies reflect the popularity and capability of unsupervised contrastive methods. In this paper, we view unsupervised contrastive learning as a pretraining method, where the goal is to obtain pretrained representations that can be transferred to downstream tasks via fine-tuning. The benefit of unsupervised over supervised learning is that it can leverage unlabeled data, which are more accessible and less expensive than labeled data. Because labeled data are comparatively scarce and costly, developing and understanding unsupervised pretraining methods is necessary.
