Review for NeurIPS paper: Unsupervised Learning of Dense Visual Representations

Neural Information Processing Systems 

A key limitation of this work is that the proposed network, VADeR, is always initialized from MoCo self-supervised pre-training. While this is benign for practical purposes, it conflates the two methods and also means that VADeR receives more total training than MoCo alone. Training a randomly initialized network with the proposed method would provide crucial empirical evidence, and would only strengthen, not weaken, the experiments and claims.