A Experimental Details

Neural Information Processing Systems 

A.1 Networks used for comparison

A.2 CIFAR-10:

ResNets: We train several ResNets to compare representations. The base architecture for all our experiments is ResNet-18 [He et al., 2015], adapted to CIFAR-10 input dimensions, with 64 filters in the first convolutional layer. We also train a wider ResNet-w2x and a narrower ResNet-0.5x. For the deep ResNet, we train a ResNet-164 [He et al., 2015]. For the experiments that vary the number of samples or training epochs, we train the base ResNet-18 with the specified number of samples and epochs.
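
The width variants can be illustrated with a small configuration sketch. It is not our training code. It assumes that "w2x" and "0.5x" multiply the first-layer filter count (64 in the base network) by 2 and 0.5 respectively, and that the layout is the standard ResNet-18 one: four stages of two basic blocks each, with the channel count doubling at every stage. The names `ResNetConfig`, `scaled_resnet18`, and `VARIANTS` are hypothetical and introduced only for this illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ResNetConfig:
    """Hypothetical description of a ResNet variant (illustration only)."""
    name: str
    blocks_per_stage: Tuple[int, ...]  # number of residual blocks in each stage
    base_width: int                    # filters in the first convolutional layer

    @property
    def stage_widths(self) -> Tuple[int, ...]:
        # Standard ResNet convention: channel count doubles at every stage.
        return tuple(self.base_width * 2 ** i
                     for i in range(len(self.blocks_per_stage)))


def scaled_resnet18(name: str, width_multiplier: float) -> ResNetConfig:
    # ResNet-18 layout: four stages with two basic blocks each.
    # Assumption: the multiplier scales the 64 first-layer filters uniformly.
    return ResNetConfig(name=name,
                        blocks_per_stage=(2, 2, 2, 2),
                        base_width=int(64 * width_multiplier))


# Assumed multipliers for the variants named in the text.
VARIANTS = {
    "ResNet-18": scaled_resnet18("ResNet-18", 1.0),
    "ResNet-w2x": scaled_resnet18("ResNet-w2x", 2.0),
    "ResNet-0.5x": scaled_resnet18("ResNet-0.5x", 0.5),
}

if __name__ == "__main__":
    for cfg in VARIANTS.values():
        print(cfg.name, cfg.stage_widths)
```

Under these assumptions the base network has stage widths (64, 128, 256, 512), and the wider and narrower variants double and halve each of them.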