Batch normalization provably avoids ranks collapse for randomly initialised deep networks

Neural Information Processing Systems

Randomly initialized neural networks are known to become harder to train with increasing depth, unless architectural enhancements like residual connections and batch normalization are used. We here investigate this phenomenon by revisiting the connection between random initialization in deep networks and spectral instabilities in products of random matrices. Given the rich literature on random matrices, it is not surprising to find that the rank of the intermediate representations in unnormalized networks collapses quickly with depth. In this work we highlight the fact that batch normalization is an effective strategy to avoid rank collapse for both linear and ReLU networks. Leveraging tools from Markov chain theory, we derive a meaningful lower rank bound in deep linear networks. Empirically, we also demonstrate that this rank robustness generalizes to ReLU nets. Finally, we conduct an extensive set of experiments on real-world data sets, which confirm that rank stability is indeed a crucial condition for training modern-day deep neural architectures.
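To make the rank-collapse phenomenon concrete, the following minimal sketch (illustrative only, not the authors' code) tracks the numerical rank of a batch of activations as it passes through a deep, randomly initialized linear network, with and without a batch-normalization step between layers:

    import numpy as np

    rng = np.random.default_rng(0)
    batch, width, depth = 64, 32, 50

    def numerical_rank(h, tol=1e-6):
        # Number of singular values within a relative tolerance of the largest.
        s = np.linalg.svd(h, compute_uv=False)
        return int(np.sum(s > tol * s[0]))

    def batch_norm(h, eps=1e-5):
        # Per-feature standardization over the batch (no learned scale/shift).
        return (h - h.mean(axis=0)) / np.sqrt(h.var(axis=0) + eps)

    for use_bn in (False, True):
        h = rng.standard_normal((batch, width))
        for _ in range(depth):
            w = rng.standard_normal((width, width)) / np.sqrt(width)
            h = h @ w
            if use_bn:
                h = batch_norm(h)
        print(f"BN={use_bn}: numerical rank after {depth} layers = {numerical_rank(h)}")

In runs of this sketch, the unnormalized matrix product loses rank quickly as depth grows, while the batch-normalized network retains a much higher rank, consistent with the paper's lower bound (of order the square root of the width) for deep linear networks.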


Review for NeurIPS paper: Batch normalization provably avoids ranks collapse for randomly initialised deep networks

Neural Information Processing Systems

Weaknesses: While there are not many technical or experimental weaknesses in this paper, I wonder whether rank-preserving transformations are also important in other learning models, say linear ones or kernel machines. It could be that this phenomenon is exclusive to deep networks and that other models are not relevant here. Another issue is that in the case of binary classification one could still perform the task when rank collapse happens, as long as the relevant discriminatory signal is captured by the principal direction that the data is collapsed to; I would like to know whether the authors agree or disagree with this hypothetical, which is illustrated in the sketch below. Finally, I do not think the authors address the case where the networks are overparameterized and d ≫ N in each layer.
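To probe the reviewer's binary-classification hypothetical, here is a small illustrative sketch (not from the paper or the review): when every representation is collapsed onto a single direction, a simple threshold classifier still succeeds exactly when that direction carries the label signal, and fails when it does not.

    import numpy as np

    rng = np.random.default_rng(1)
    n, d = 200, 10

    # Hypothetical setup: the label depends only on the first input coordinate.
    x = rng.standard_normal((n, d))
    y = np.sign(x[:, 0])

    def rank1_collapse(x, u):
        # Reduce each sample to its coefficient along a single direction u.
        u = u / np.linalg.norm(u)
        return x @ u

    def threshold_accuracy(score, y):
        # Accuracy of the better of the two sign classifiers on a 1-D score.
        return max(np.mean(np.sign(score) == y), np.mean(np.sign(-score) == y))

    # Collapse along the discriminative axis: the label signal survives.
    print("aligned:   ", threshold_accuracy(rank1_collapse(x, np.eye(d)[0]), y))  # ~1.0
    # Collapse along an orthogonal axis: the label signal is destroyed.
    print("orthogonal:", threshold_accuracy(rank1_collapse(x, np.eye(d)[1]), y))  # ~0.5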

