Reviews: Self-Normalizing Neural Networks
Neural Information Processing Systems
The paper proposes a new (class of) activation function f to more efficiently train very deep feed-forward neural networks (networks using f are called SNNs). The authors argue that 1) SNN activations converge towards a normalized distribution, and 2) SGD is more stable because the SNN approximately preserves variance from layer to layer. In fact, f belongs to a family of activation functions for which theoretical guarantees of fixed-point convergence exist. These functions are contraction mappings and are characterized by mean/variance preservation across layers at the fixed point -- solving these constraints allows, in principle, finding other "self-normalizing" f. Whether f converges to a fixed point is sensitive to the choice of hyper-parameters: the authors demonstrate weight initializations and parameter settings that yield the fixed-point behavior.
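Since the review turns on this fixed-point property, a minimal numpy sketch may help make it concrete. The paper's f is SELU: f(x) = λx for x > 0 and λα(e^x − 1) otherwise, with λ ≈ 1.0507 and α ≈ 1.6733 chosen so a unit-Gaussian input keeps zero mean and unit variance. The network width, depth, batch size, and seed below are illustrative, not from the paper; the weight initialization N(0, 1/n) is the one the authors recommend.

```python
import numpy as np

# SELU constants from the paper: they solve the fixed-point
# constraints E[f(z)] = 0 and Var[f(z)] = 1 for z ~ N(0, 1).
ALPHA = 1.6732632423543772
LAMBDA = 1.0507009873554805

def selu(x):
    """Scaled exponential linear unit."""
    return LAMBDA * np.where(x > 0, x, ALPHA * (np.exp(x) - 1.0))

# Empirical check (illustrative sizes): push standard-normal inputs
# through a deep stack of random layers with weights ~ N(0, 1/n) and
# watch activation mean/variance settle near the (0, 1) fixed point.
rng = np.random.default_rng(0)
n, depth = 1024, 32
x = rng.standard_normal((512, n))
for _ in range(depth):
    W = rng.normal(0.0, np.sqrt(1.0 / n), size=(n, n))
    x = selu(x @ W)
print(f"after {depth} layers: mean={x.mean():+.3f}, var={x.var():.3f}")
# Expected: mean near 0 and variance near 1, i.e. the
# self-normalizing behavior the review describes.
```

Re-running with a different initialization scale (e.g. weight variance 2/n instead of 1/n) moves the activations away from the (0, 1) fixed point, which is the hyper-parameter sensitivity noted above.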