Appendix

Neural Information Processing Systems 

We trained ResNet-50 for 1.1M iterations. We used an SGD optimizer with a 0.03 learning rate, 32 batch size, 0.9