Batch Normalization Provably Avoids Rank Collapse for Randomly Initialised Deep Networks

Neural Information Processing Systems 

Given the rich literature on random matrices, it is not surprising to find that the rank of the intermediate representations in unnormalized networks collapses quickly with depth.
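
The collapse can be observed numerically with a minimal sketch. It makes several assumptions that are not taken from the text above: the network is linear, the weights are i.i.d. Gaussian scaled by 1/sqrt(width), and rank is measured by a thresholded singular-value count. The names `effective_rank` and `rank_through_depth`, the tolerance, and the width, batch size and depth are all illustrative choices.

```python
import numpy as np


def effective_rank(H, tol=1e-3):
    """Count singular values larger than `tol` times the largest one.

    This thresholded count is an illustrative proxy for rank, chosen
    because the exact rank of a floating-point matrix is ill-defined.
    """
    s = np.linalg.svd(H, compute_uv=False)  # sorted in descending order
    return int(np.sum(s > tol * s[0]))


def rank_through_depth(depth, width=32, batch=64, seed=0):
    """Track the effective rank of hidden representations layer by layer.

    Assumes an unnormalized *linear* network with Gaussian weights of
    variance 1/width; H has one column per sample in the mini-batch.
    """
    rng = np.random.default_rng(seed)
    H = rng.standard_normal((width, batch))  # input representations
    ranks = [effective_rank(H)]
    for _ in range(depth):
        W = rng.standard_normal((width, width)) / np.sqrt(width)
        H = W @ H  # no normalization between layers
        ranks.append(effective_rank(H))
    return ranks


if __name__ == "__main__":
    ranks = rank_through_depth(depth=100)
    for layer in (0, 10, 25, 50, 100):
        print(f"layer {layer:3d}: effective rank = {ranks[layer]}")
```

Because the threshold is relative to the largest singular value, the count is insensitive to the overall scale of the representations, so any decrease reflects the spectrum spreading out rather than the activations shrinking or growing.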