Review for NeurIPS paper: Self-Supervised Learning by Cross-Modal Audio-Video Clustering

Neural Information Processing Systems 

The reviewers generally agree that this paper has a great idea, great execution, and great results. They noted the potential impact of self-supervised learning on video, an area that has been less explored than its image counterpart. They also praised the strong empirical results, which will be of high interest to the community.