Revisiting ResNets: Improved Training and Scaling Strategies

Neural Information Processing Systems 

Novel computer vision architectures monopolize the spotlight, but the impact of the model architecture is often conflated with simultaneous changes to training methodology and scaling strategies.