Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity

Amit Daniely, Roy Frostig, Yoram Singer

Neural Information Processing Systems 

We develop a general duality between neural networks and compositional kernel Hilbert spaces. We introduce the notion of a computation skeleton, an acyclic graph that succinctly describes both a family of neural networks and a kernel space. Random neural networks are generated from a skeleton by replicating each node and assigning weights drawn from a normal distribution. The kernel space consists of functions that arise through composition, averaging, and non-linear transformations governed by the skeleton's graph topology and activation functions. We prove that random networks induce representations that approximate the kernel space: inner products of these random representations concentrate around the skeleton's kernel.
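The abstract leaves the construction implicit, so below is a minimal NumPy sketch, not the paper's implementation, for the simplest skeleton: a chain of fully connected layers with sqrt(2)-scaled ReLU activations. Each layer of the compositional kernel applies the activation's dual, hat_sigma(rho) = E[sigma(X)sigma(Y)] for standard bivariate normals with correlation rho, which for this ReLU has the closed form (sqrt(1 - rho^2) + (pi - arccos(rho)) * rho) / pi (the arc-cosine kernel of Cho and Saul). The random network replicates each skeleton node `width` times with i.i.d. Gaussian weights. The chain topology, the normalization, and all function names here are choices made for the sketch.

import numpy as np

def dual_relu(rho):
    # Dual activation of the sqrt(2)-scaled ReLU: hat_sigma(rho) =
    # E[sigma(X)sigma(Y)] over standard bivariate normals with
    # correlation rho, normalized so that dual_relu(1) == 1.
    rho = np.clip(rho, -1.0, 1.0)
    return (np.sqrt(1.0 - rho**2) + (np.pi - np.arccos(rho)) * rho) / np.pi

def skeleton_kernel(x, xp, depth):
    # Compositional kernel of a chain skeleton with `depth` fully
    # connected ReLU layers: start from the normalized inner product
    # of the inputs, then apply the dual activation once per layer.
    rho = (x @ xp) / (np.linalg.norm(x) * np.linalg.norm(xp))
    for _ in range(depth):
        rho = dual_relu(rho)
    return rho

def random_net_features(x, weights):
    # Representation computed by the random network: each skeleton node
    # is replicated `width` times with i.i.d. N(0, 1) weights; the
    # sqrt(2/width) factor keeps E[||h||^2] = 1 from layer to layer.
    h = x / np.linalg.norm(x)
    for W in weights:
        h = np.sqrt(2.0 / W.shape[0]) * np.maximum(W @ h, 0.0)
    return h

rng = np.random.default_rng(0)
d, depth, width = 8, 3, 2000
x, xp = rng.normal(size=d), rng.normal(size=d)
sizes = [d] + [width] * depth
# One random network, shared by both inputs.
weights = [rng.normal(size=(sizes[i + 1], sizes[i])) for i in range(depth)]
exact = skeleton_kernel(x, xp, depth)
estimate = random_net_features(x, weights) @ random_net_features(xp, weights)
print(f"compositional kernel: {exact:.4f}   random-net estimate: {estimate:.4f}")

With a width of a few thousand, the two printed values typically agree to within a few hundredths, illustrating, under these assumptions, the concentration the duality theorem formalizes.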