Causal-Informed Contrastive Learning: Towards Bias-Resilient Pre-training under Concept Drift

Xiaoyu Yang, Jie Lu, En Yu

arXiv.org Artificial Intelligence 

Contrastive learning has proven highly effective for pre-training large-scale models, especially large vision models built on frameworks such as SimCLR [1, 2], the MoCo series [3, 4], and the DINO series [5, 6]. However, as large models continue to scale, the data hunger of contrastive learning is drawing growing attention in the community to the problem of pre-training effectively from drifting data. Such drift may arise from long-tailed data, noise, and domain shift; concept drift [7, 8] is used as a unified term for these unpredictable distribution changes during contrastive pre-training. A pertinent question therefore emerges: beyond existing contrastive learning methods, can the contrastive paradigm learn effectively from drifting pre-training data? In this work, we aim to bridge this gap by providing a systematic analysis of this question. Our findings highlight critical vulnerabilities of the current contrastive pre-training paradigm in adapting to these challenges, underscoring the need for novel strategies that enhance robustness on drifting data streams. More related works are provided in Appendix A.

Current contrastive pre-training methods predominantly follow the paradigm of comparing two distinct views of the same object, typically produced by different encoders.
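As a concrete illustration of this two-view paradigm, the sketch below shows a MoCo-style setup in which a query encoder and a momentum-updated key encoder embed two augmented views of the same batch and are trained with an InfoNCE loss over in-batch negatives. The encoder interfaces, momentum coefficient, and temperature here are illustrative assumptions, not the settings used in this paper.

```python
import torch
import torch.nn.functional as F

def momentum_update(query_encoder, key_encoder, m=0.99):
    """Exponential moving average of query weights into the key encoder."""
    for q_param, k_param in zip(query_encoder.parameters(),
                                key_encoder.parameters()):
        k_param.data.mul_(m).add_(q_param.data, alpha=1.0 - m)

def info_nce_loss(q, k, temperature=0.2):
    """InfoNCE over in-batch negatives: matching (q_i, k_i) pairs are positives."""
    q = F.normalize(q, dim=1)
    k = F.normalize(k, dim=1)
    logits = q @ k.t() / temperature            # (N, N) similarity matrix
    labels = torch.arange(q.size(0), device=q.device)
    return F.cross_entropy(logits, labels)

def training_step(query_encoder, key_encoder, view_a, view_b):
    """One contrastive step on two augmented views of the same image batch."""
    q = query_encoder(view_a)                   # gradients flow through the query branch
    with torch.no_grad():
        k = key_encoder(view_b)                 # key branch is updated only via momentum
    loss = info_nce_loss(q, k)
    momentum_update(query_encoder, key_encoder)
    return loss
```

Note that the two encoders see two augmentations of the same underlying sample, so any drift in the input distribution (long-tailed classes, noise, domain shift) directly biases which pairs the loss treats as positives and negatives.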