Mean Field Residual Networks: On the Edge of Chaos

Feb-14-2020, 19:43:42 GMT–Neural Information Processing Systems

We study randomly initialized residual networks using mean field theory and the theory of difference equations. In contrast to the exponential dynamics of classical feedforward networks, we show that adding skip connections gives the network, depending on the nonlinearity, subexponential forward and backward dynamics, which in many cases are in fact polynomial. The exponents of these polynomials are obtained through analytic methods, proved correct, and verified empirically. In terms of the "edge of chaos" hypothesis, these subexponential and polynomial laws allow residual networks to "hover over the boundary between stability and chaos," thus preserving the geometry of the input space and the gradient information flow. In our experiments, for each activation function we study, we initialize residual networks with different hyperparameters and train them on MNIST.
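To make the contrast between forward dynamics with and without skip connections concrete, the following is a minimal sketch rather than the paper's own formulation. It assumes a tanh nonlinearity, Gaussian pre-activations (the usual mean field independence assumption), and a residual layer of the form x_l = x_{l-1} + V tanh(W x_{l-1} + b) with zero-mean random V. The function names and the variance parameters `sw2`, `sb2`, `sv2` are hypothetical choices made for illustration.

```python
import math


def gaussian_mean(f, n=801, lim=8.0):
    """Approximate E[f(z)] for z ~ N(0, 1) with the trapezoid rule on [-lim, lim]."""
    h = 2.0 * lim / (n - 1)
    total = 0.0
    for i in range(n):
        z = -lim + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * f(z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)


def tanh_second_moment(q):
    """E[tanh(sqrt(q) * z)^2]: mean squared output of a tanh unit whose
    pre-activation is Gaussian with variance q."""
    s = math.sqrt(q)
    return gaussian_mean(lambda z: math.tanh(s * z) ** 2)


def feedforward_lengths(depth, p0=1.0, sw2=1.0, sb2=0.1):
    """Mean squared activation p_l per layer WITHOUT skip connections.
    sw2 and sb2 are assumed weight and bias variances."""
    p = [p0]
    for _ in range(depth):
        q = sw2 * p[-1] + sb2  # pre-activation variance at this layer
        p.append(tanh_second_moment(q))
    return p


def residual_lengths(depth, p0=1.0, sw2=1.0, sb2=0.1, sv2=1.0):
    """Same recursion WITH an identity skip connection: each layer adds
    sv2 * E[tanh^2] to the running mean squared activation."""
    p = [p0]
    for _ in range(depth):
        q = sw2 * p[-1] + sb2
        p.append(p[-1] + sv2 * tanh_second_moment(q))
    return p


if __name__ == "__main__":
    ff = feedforward_lengths(100)
    res = residual_lengths(100)
    # Print layer index, feedforward value, residual value.
    for layer in (1, 10, 50, 100):
        print(layer, round(ff[layer], 4), round(res[layer], 4))
```

Under these assumptions the feedforward recursion settles quickly to a fixed point, while the residual recursion keeps growing by a bounded amount per layer, so its growth is at most linear in depth. This is one simple instance of polynomial rather than exponential forward dynamics.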

chaos, mean field residual network, residual network

Genre:
- Research Report (0.40)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.58)