Batch Normalization Biases Deep Residual Networks Towards Shallow Paths
