Contrastive learning, multi-view redundancy, and linear models
Tosh, Christopher; Krishnamurthy, Akshay; Hsu, Daniel
Self-supervised learning is an empirically successful approach to unsupervised learning based on creating artificial supervised learning problems. A popular self-supervised approach to representation learning is contrastive learning, which leverages naturally occurring pairs of similar and dissimilar data points, or multiple views of the same data. This work provides a theoretical analysis of contrastive learning in the multi-view setting, where two views of each datum are available. The main result is that linear functions of the learned representations are nearly optimal on downstream prediction tasks whenever the two views provide redundant information about the label.
Aug-23-2020
- Country:
  - North America > United States
    - New York > New York County
      - New York City (0.04)
    - Illinois > Cook County
      - Chicago (0.04)
  - Asia > Middle East
    - Jordan (0.04)
- Genre:
  - Research Report (0.64)