Batch Normalization Biases Residual Blocks Towardsthe Identity Functionin Deep Networks

Open in new window