Supplementary Material for Unsupervised Learning of Visual Features by Contrasting Cluster Assignments

Neural Information Processing Systems 

In this section, we provide the details and hyperparameters for SwAV pretraining and transfer learning.

A.1 Implementation details of SwAV training

First, we provide pseudo-code for the SwAV training loop using two crops, in PyTorch style:

    # C: prototypes (DxK)
    # model: convnet + projection head
    # temp: temperature
    for x in loader: # load a batch x with B samples
        x_t = t(x) # t is a random augmentation
        x_s = s(x) # s is another random augmentation
        z = model(cat(x_t, x_s)) # embeddings: 2BxD
        scores = mm(z, C) # prototype scores: 2BxK
        scores_t = scores[:B]
        scores_s = scores[B:]
        # compute assignments with Sinkhorn-Knopp
        with torch.no_grad():
            q_t = sinkhorn(scores_t)
            q_s = sinkhorn(scores_s)
        # convert scores to probabilities
        p_t = softmax(scores_t / temp)
        p_s = softmax(scores_s / temp)
        # swapped prediction problem
        loss = -0.5 * mean(q_t * log(p_s) + q_s * log(p_t))
        # SGD update: network and prototypes
        loss.backward()
        update(model.params)
        update(C)
        # normalize prototypes
        with torch.no_grad():
            C = normalize(C, dim=0, p=2)

Contrastive methods that compare image features directly require sharing the feature matrix across all GPUs at every batch, which can become a bottleneck when distributing training across many GPUs.

A.2 Data augmentation used in SwAV

We obtain two different views of an image by performing crops of random sizes and aspect ratios.

For the semi-supervised learning experiments, we use the 1% and 10% splits specified in the official code release of SimCLR.
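The assignment step in the training loop relies on the Sinkhorn-Knopp algorithm, which turns prototype scores into soft cluster assignments under an equipartition constraint. The following is a minimal NumPy sketch, not the paper's implementation; the function name `sinkhorn` and the defaults `eps=0.05` and `n_iters=3` are illustrative choices here:

```python
import numpy as np

def sinkhorn(scores, eps=0.05, n_iters=3):
    """Sinkhorn-Knopp normalization (illustrative sketch).

    scores: (B, K) prototype scores for a batch of B samples.
    Returns a (B, K) soft-assignment matrix Q whose rows sum to 1
    (one distribution over prototypes per sample) and whose columns
    are approximately uniform (each prototype gets ~B/K total mass).
    """
    Q = np.exp(scores / eps).T      # (K, B) transportation plan
    Q /= Q.sum()                    # make Q a joint probability matrix
    K, B = Q.shape
    for _ in range(n_iters):
        # normalize rows: equal total mass per prototype
        Q /= Q.sum(axis=1, keepdims=True)
        Q /= K
        # normalize columns: equal total mass per sample
        Q /= Q.sum(axis=0, keepdims=True)
        Q /= B
    Q *= B                          # each sample's assignment sums to 1
    return Q.T                      # back to (B, K)
```

Because the column normalization runs last, each sample's row in the returned matrix sums to exactly 1, while the per-prototype balance is only approximate after a few iterations.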
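The loss itself is a swapped prediction problem: the assignment computed from one view is predicted from the scores of the other view via a temperature-scaled softmax. A self-contained NumPy sketch, with `q_t`/`q_s` standing in for the assignments computed above and the helper names chosen for illustration:

```python
import numpy as np

def log_softmax(x):
    # numerically stable row-wise log-softmax
    x = x - x.max(axis=1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=1, keepdims=True))

def swapped_loss(scores_t, scores_s, q_t, q_s, temp=0.1):
    """Swapped prediction loss (illustrative sketch).

    Cross-entropy between the assignments of one view (q) and the
    softmax predictions of the other view (p), averaged both ways.
    """
    log_p_t = log_softmax(scores_t / temp)
    log_p_s = log_softmax(scores_s / temp)
    return -0.5 * ((q_t * log_p_s).sum(axis=1).mean()
                   + (q_s * log_p_t).sum(axis=1).mean())
```

Since each row of `q_t` and `q_s` is a probability distribution and log-probabilities are non-positive, the loss is always non-negative.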
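The random-size, random-aspect-ratio crops can be sampled with rejection sampling over area and aspect ratio, in the style of torchvision's RandomResizedCrop. A dependency-free sketch; the function name and the default `scale`/`ratio` bounds are assumptions for illustration, not the paper's exact settings:

```python
import random

def random_crop_params(height, width, scale=(0.14, 1.0), ratio=(3/4, 4/3)):
    """Sample a crop box with random area fraction (`scale`) and random
    aspect ratio (`ratio`). Returns (top, left, crop_h, crop_w).
    Illustrative sketch of RandomResizedCrop-style sampling."""
    area = height * width
    for _ in range(10):  # rejection-sample until the box fits
        target_area = area * random.uniform(*scale)
        aspect = random.uniform(*ratio)
        w = int(round((target_area * aspect) ** 0.5))
        h = int(round((target_area / aspect) ** 0.5))
        if 0 < w <= width and 0 < h <= height:
            top = random.randint(0, height - h)
            left = random.randint(0, width - w)
            return top, left, h, w
    # fallback if no valid box was found: keep the whole image
    return 0, 0, height, width
```

Each crop would then be resized to the target resolution, yielding views that differ in both scale and aspect ratio.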