A Additional Results
It is thus important to compare our method's compute requirements to competing methods. Table 7 reports the throughput of our ImageNet models, measured in images per V100-second; we include communication time across two machines whenever our training batch size does not fit on a single machine. We find that a naive implementation of our models in PyTorch 1.7 is very inefficient, utilizing only a fraction of the available GPU compute. Nonetheless, our models can match BigGAN-deep with the same or lower compute budget. In addition, we can train for many fewer iterations while maintaining sample quality superior to BigGAN-deep.

Table 7: Throughput of our ImageNet models, measured in images per V100-sec. FID evaluated over 10k samples instead of 50k for efficiency.
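To illustrate how a throughput figure of this kind can be obtained, the sketch below times forward and backward passes over synthetic batches. This is a minimal sketch, not the paper's benchmarking code: the `model(x, t)` signature (a diffusion-style network taking an image batch and a timestep tensor), the batch size, the image resolution, and the `measure_throughput` helper are all assumptions, and multi-machine communication time is not modeled here.

```python
import time

import torch


def measure_throughput(model, batch_size=64, image_size=128, n_iters=20, warmup=5):
    """Rough estimate of training throughput in images per GPU-second."""
    device = torch.device("cuda")
    model = model.to(device).train()

    # Synthetic batch; model(x, t) is an assumed diffusion-style signature.
    x = torch.randn(batch_size, 3, image_size, image_size, device=device)
    t = torch.randint(0, 1000, (batch_size,), device=device)

    # Warm-up iterations so one-time CUDA initialization and kernel
    # caching are excluded from the measurement.
    for _ in range(warmup):
        model(x, t).sum().backward()
    torch.cuda.synchronize()

    start = time.time()
    for _ in range(n_iters):
        model(x, t).sum().backward()
    # CUDA launches are asynchronous; wait for all queued kernels to
    # finish before stopping the clock.
    torch.cuda.synchronize()
    elapsed = time.time() - start

    return batch_size * n_iters / elapsed
```

The calls to `torch.cuda.synchronize()` matter here: CUDA kernels launch asynchronously, so without synchronizing before reading the clock the timer would mostly capture kernel launch overhead rather than actual compute time.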