On the Initialisation of Wide Low-Rank Feedforward Neural Networks
Thiziri Nait Saada, Jared Tanner
arXiv.org Artificial Intelligence
The edge-of-chaos dynamics of wide, randomly initialized, low-rank feedforward networks are analyzed. Formulae for the optimal weight and bias variances are extended from the full-rank to the low-rank setting and are shown to follow from multiplicative scaling. The principal second-order effect, the variance of the input-output Jacobian, is derived and shown to increase as the rank-to-width ratio decreases. These results inform practitioners how to randomly initialize feedforward networks that have fewer learnable parameters but the same ambient dimension, reducing the computational cost and memory requirements of the associated network.
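As a rough illustration of what initializing a low-rank layer involves, the sketch below factors a width-n weight matrix as W = AB with rank r and checks the per-entry variance of W empirically. The variance split between the two factors, the target variance sigma_w^2 / n, and the helper names `init_low_rank`, `matmul`, and `mean_square` are all assumptions made for this example; they are not the formulae derived in the paper.

```python
import random


def init_low_rank(n, r, sigma_w=1.0, seed=0):
    """Return Gaussian factors A (n x r) and B (r x n) of a rank-r matrix W = A B.

    Assumed variance split (illustrative only, not taken from the paper):
    Var(A_ik) = 1 / r and Var(B_kj) = sigma_w**2 / n, so each entry of W
    has variance r * (1 / r) * (sigma_w**2 / n) = sigma_w**2 / n.
    """
    rng = random.Random(seed)
    std_a = (1.0 / r) ** 0.5
    std_b = sigma_w / n ** 0.5
    A = [[rng.gauss(0.0, std_a) for _ in range(r)] for _ in range(n)]
    B = [[rng.gauss(0.0, std_b) for _ in range(n)] for _ in range(r)]
    return A, B


def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def mean_square(W):
    """Average of the squared entries, an estimate of the entry variance."""
    vals = [w * w for row in W for w in row]
    return sum(vals) / len(vals)


n, r, sigma_w = 128, 16, 1.5
A, B = init_low_rank(n, r, sigma_w)
W = matmul(A, B)

# Storing A and B takes 2 * n * r numbers instead of n * n for a dense W.
print("parameters:", 2 * n * r, "vs", n * n)
# n times the estimated entry variance should be close to sigma_w**2.
print("n * Var(W_ij) ~", mean_square(W) * n, "target", sigma_w ** 2)
```

With these sizes the factored layer stores 4096 numbers rather than 16384. The entries of W are no longer independent once r < n, which is one reason second-order quantities such as the Jacobian variance can differ from the full-rank case even when the entry variance is matched.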
Jan-31-2023