A surprisingly simple technique to control the pretraining bias for better transfer: Expand or Narrow your representation

Bordes, Florian, Lavoie, Samuel, Balestriero, Randall, Ballas, Nicolas, Vincent, Pascal

Apr-11-2023, arXiv.org Artificial Intelligence

Self-Supervised Learning (SSL) models rely on a pretext task to learn representations. Because this pretext task differs from the downstream tasks used to evaluate these models, there is an inherent misalignment, or pretraining bias. A commonly used trick in SSL, shown to make deep networks more robust to such bias, is the addition of a small projector (usually a 2- or 3-layer multi-layer perceptron) on top of a backbone network during training. In contrast to previous work that studied the impact of the projector architecture, we focus here on a simpler yet overlooked lever for controlling the information in the backbone representation. We show that merely changing its dimensionality, by changing only the size of the backbone's very last block, is a remarkably effective technique to mitigate the pretraining bias. It significantly improves downstream transfer performance for both self-supervised and supervised pretrained models.
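To make the setup in the abstract concrete, here is a minimal pure-Python sketch of a backbone followed by a projector, where the representation size is controlled by a single width parameter. It is a toy illustration under stated assumptions, not the authors' implementation: the backbone is modeled as a small stack of random linear layers rather than a deep network, and every name (`build_model`, `forward`, `last_block_width`, `projector_widths`) is hypothetical.

```python
import random


def linear(in_dim, out_dim, rng):
    """Return a random weight matrix: out_dim rows of in_dim weights."""
    scale = in_dim ** -0.5
    return [[rng.uniform(-scale, scale) for _ in range(in_dim)]
            for _ in range(out_dim)]


def apply(weights, x, relu=True):
    """Apply one linear layer, optionally followed by ReLU."""
    out = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return [max(0.0, o) for o in out] if relu else out


def build_model(input_dim, hidden_widths, last_block_width,
                projector_widths, seed=0):
    """Toy backbone + projector (assumed layout, for illustration only).

    Only last_block_width changes the size of the backbone representation;
    the earlier layers and the projector's output size stay fixed.
    """
    rng = random.Random(seed)
    backbone_dims = [input_dim, *hidden_widths, last_block_width]
    projector_dims = [last_block_width, *projector_widths]
    backbone = [linear(a, b, rng)
                for a, b in zip(backbone_dims, backbone_dims[1:])]
    projector = [linear(a, b, rng)
                 for a, b in zip(projector_dims, projector_dims[1:])]
    return backbone, projector


def forward(backbone, projector, x):
    """Return (representation, embedding) for one input vector."""
    h = x
    for w in backbone:
        h = apply(w, h)
    representation = h  # what a downstream task would consume
    z = representation
    for i, w in enumerate(projector):
        # No ReLU on the projector's final layer.
        z = apply(w, z, relu=(i < len(projector) - 1))
    return representation, z  # z would feed the pretext-task loss only


if __name__ == "__main__":
    x = [0.5] * 8
    # Narrow or expand the representation by changing only the last width.
    for width in (4, 16, 64):
        backbone, projector = build_model(8, [16, 16], width, [32, 8])
        rep, emb = forward(backbone, projector, x)
        print(width, len(rep), len(emb))
```

In this sketch the length of `rep` follows `last_block_width`, while the projector output `emb` keeps the same size (8) in every configuration, which mirrors the idea of resizing only the backbone's final block while leaving the rest of the pipeline untouched.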

artificial intelligence, machine learning, representation

Country:
- North America
  - United States
    - New York > New York County
      - New York City (0.04)
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
  - Canada
    - Quebec > Montreal (0.04)
    - Ontario > Toronto (0.04)
    - British Columbia > Metro Vancouver Regional District
      - Vancouver (0.04)
- Asia > Middle East
  - Israel > Tel Aviv District > Tel Aviv (0.04)

Genre:
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Statistical Learning (0.87)
  - Neural Networks > Perceptrons (0.54)
