Reviews: The Normalization Method for Alleviating Pathological Sharpness in Wide Neural Networks

Jan-26-2025, 02:26:10 GMT–Neural Information Processing Systems

This well-written paper is the latest in a series of works that analyze how signals propagate in random neural networks by studying the mean and variance of activations and gradients under random inputs and weights. The technical contribution can be considered incremental with respect to this series. However, although the techniques are not new, the analysis yields new insights into the use of batch and layer normalization. In particular, it takes a close look at the mechanisms that lead to pathological sharpness in DNNs and shows that mean subtraction is the main ingredient that counters them. While these claims would have to be verified in more complicated settings (e.g. with more complicated distributions on inputs and weights), knowing that they hold for such simple networks is an important first step.

