Exchangeability and Kernel Invariance in Trained MLPs
Russell Tsuchida, Fred Roosta, Marcus Gallagher
Despite the widespread use of deep learning in applications (Mnih et al., 2015; Kalchbrenner et al., 2017; Silver et al., 2017; van den Oord et al., 2018), current theoretical understanding of deep networks continues to lag behind engineering practice. Recent theoretical contributions have considered networks in their randomly initialized state, or have made strong assumptions about the parameters or data during training. For example, Cho & Saul (2009); Daniely et al. (2016); Bach (2017); Tsuchida et al. (2018) analyze the kernels of neural networks with IID random weights. Insightful analyses connecting signal propagation in deep networks to chaos have made similar assumptions (Poole et al., 2016; Raghu et al., 2017). Clearly, the assumption of IID random weights is only valid while the network is in its random initial state.
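To illustrate the kind of random-weight kernel studied in the works cited above, the following minimal sketch (not from this paper; the vectors x, y and the sample count are arbitrary choices) Monte Carlo estimates the kernel E_w[ReLU(w·x) ReLU(w·y)] with w ~ N(0, I), and compares it against the closed-form degree-1 arc-cosine kernel of Cho & Saul (2009), which equals this expectation up to a factor of 2:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_samples = 3, 200_000

# Arbitrary example inputs (any vectors in R^d would do).
x = np.array([1.0, 2.0, -0.5])
y = np.array([0.5, -1.0, 1.5])

# Monte Carlo estimate of E_w[relu(w.x) relu(w.y)] with w ~ N(0, I_d),
# i.e. the kernel of a one-hidden-layer ReLU network with IID Gaussian weights.
W = rng.standard_normal((n_samples, d))
mc = np.mean(np.maximum(W @ x, 0.0) * np.maximum(W @ y, 0.0))

# Closed form: (1/(2*pi)) * ||x|| * ||y|| * (sin t + (pi - t) cos t),
# where t is the angle between x and y (half the degree-1 arc-cosine kernel).
nx, ny = np.linalg.norm(x), np.linalg.norm(y)
t = np.arccos(np.clip((x @ y) / (nx * ny), -1.0, 1.0))
closed = nx * ny * (np.sin(t) + (np.pi - t) * np.cos(t)) / (2 * np.pi)

print(f"Monte Carlo: {mc:.4f}   closed form: {closed:.4f}")
```

The two values agree to Monte Carlo error, which is the sense in which a wide network with IID random weights induces a deterministic kernel; once the weights are trained, the IID assumption underlying this calculation no longer holds.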