$\alpha$-Stable convergence of heavy-tailed infinitely-wide neural networks

Jung, Paul, Lee, Hoil, Lee, Jiho, Yang, Hongseok

Jun-17-2021–arXiv.org Machine Learning

We consider infinitely-wide multi-layer perceptrons (MLPs), which are limits of standard deep feed-forward neural networks. We assume that, for each layer, the weights of an MLP are initialized with i.i.d. samples from either a light-tailed (finite-variance) distribution or a heavy-tailed distribution in the domain of attraction of a symmetric $\alpha$-stable distribution, where $\alpha\in(0,2]$ may depend on the layer. For the bias terms of each layer, we assume i.i.d. initializations from a symmetric $\alpha$-stable distribution with the same $\alpha$ parameter as that layer. We then extend a recent result of Favaro, Fortini, and Peluchetti (2020) to show that the vector of pre-activation values at all nodes of a given hidden layer converges in the limit, under a suitable scaling, to a vector of i.i.d. random variables with symmetric $\alpha$-stable distributions.
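The setup described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's construction: it assumes the weights and biases are exactly symmetric $\alpha$-stable with unit scale (the simplest member of the domain of attraction), a single $\alpha$ shared by both layers, a `tanh` activation, and the scaling $n^{-1/\alpha}$ for a layer of width $n$, which is the scaling that keeps a sum of $n$ i.i.d. symmetric $\alpha$-stable variables at the same scale. The function names `sample_sas` and `hidden_preactivations` are hypothetical, and the sampler uses the Chambers-Mallows-Stuck formula for the symmetric case.

```python
import math
import random


def sample_sas(alpha, rng):
    """Draw one symmetric alpha-stable sample with unit scale.

    Uses the Chambers-Mallows-Stuck formula for the symmetric case,
    valid for alpha in (0, 2]. alpha = 2 gives a Gaussian with variance 2,
    alpha = 1 gives a standard Cauchy.
    """
    u = rng.uniform(-math.pi / 2, math.pi / 2)
    w = rng.expovariate(1.0)
    if alpha == 1.0:
        return math.tan(u)
    return (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
            * (math.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))


def hidden_preactivations(x, width, alpha, rng, phi=math.tanh):
    """Pre-activations of the second hidden layer of a toy MLP at input x.

    Assumption: every weight and bias is i.i.d. symmetric alpha-stable,
    and the sum over the previous layer is scaled by width ** (-1 / alpha).
    """
    # First hidden layer: the input dimension is fixed, so no width scaling.
    y1 = [sum(sample_sas(alpha, rng) * xi for xi in x) + sample_sas(alpha, rng)
          for _ in range(width)]
    a1 = [phi(y) for y in y1]
    # Second hidden layer: sum over `width` activations, then rescale.
    scale = width ** (-1.0 / alpha)
    return [scale * sum(sample_sas(alpha, rng) * a for a in a1)
            + sample_sas(alpha, rng)
            for _ in range(width)]


if __name__ == "__main__":
    rng = random.Random(0)
    y2 = hidden_preactivations([1.0, -0.5, 2.0], width=200, alpha=1.5, rng=rng)
    # Heavy tails show up as a few entries far larger than the typical one.
    print(sorted(abs(v) for v in y2)[-5:])
```

Increasing `width` and comparing the empirical distribution of the returned values against direct draws from `sample_sas` is one informal way to see the limiting behaviour the abstract describes. A histogram of a heavy-tailed sample is hard to read, so a quantile comparison is usually more informative.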

characteristic function, convergence, neural network

