Review for NeurIPS paper: Demystifying Contrastive Self-Supervised Learning: Invariances, Augmentations and Dataset Biases

Neural Information Processing Systems 

While it is interesting that self-supervised methods are more invariant to occlusion, it is unclear why they wouldn't also be more invariant to the other augmentations used during training. For example, supervised learning appears more invariant to "Illumination Color" (Top-25 category), even though self-supervised learning methods use aggressive color augmentation. This discrepancy is not discussed, and the reader is left wondering what it means.

Next, while the analysis of transfer performance as a function of cropped vs. original training and test datasets is interesting, it is unclear whether the results really support the authors' interpretation. They find that training and testing on the same type of images (i.e. This is to be expected, as this minimizes the domain gap between training and testing.