Mehler's Formula, Branching Process, and Compositional Kernels of Deep Neural Networks

Liang, Tengyuan; Tran-Bach, Hai

arXiv.org Machine Learning 

Kernel methods and deep neural networks are arguably two representative methods that achieve state-of-the-art results in regression and classification tasks. However, unlike kernel methods, where both the statistical and computational aspects of learning are understood reasonably well, there remain many theoretical puzzles around the generalization, computation, and representation aspects of deep neural networks (Zhang et al., 2017). One hopeful direction for resolving some of these puzzles is through the lens of kernels (Rahimi and Recht, 2008, 2009; Cho and Saul, 2009; Belkin et al., 2018b). Such a connection can be readily observed in a two-layer infinite-width network with random weights; see the pioneering work of Neal (1996a) and of Rahimi and Recht (2008, 2009). For deep networks with hierarchical structure and randomly initialized weights, compositional kernels (Daniely et al., 2017a,b) have been proposed to rigorously characterize this connection, with promising empirical performance (Cho and Saul, 2009).
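To make the two-layer connection concrete, the following is a minimal sketch (not taken from the paper): for a width-m layer with i.i.d. Gaussian weights and ReLU activation, the empirical inner product of the hidden features converges, as m grows, to the closed-form order-1 arc-cosine kernel of Cho and Saul (2009). The choice of ReLU and the Monte Carlo setup here are illustrative assumptions, not the authors' construction.

    # Sketch: random ReLU features vs. the arc-cosine kernel they induce
    # in the infinite-width limit (Cho and Saul, 2009). Illustrative only.
    import numpy as np

    rng = np.random.default_rng(0)

    def empirical_kernel(x, xp, m):
        """Monte Carlo estimate of E_w[relu(w.x) relu(w.xp)], w ~ N(0, I)."""
        W = rng.standard_normal((m, x.size))          # m random hidden units
        return np.mean(np.maximum(W @ x, 0.0) * np.maximum(W @ xp, 0.0))

    def arccos_kernel(x, xp):
        """Order-1 arc-cosine kernel: closed form of the same expectation."""
        nx, nxp = np.linalg.norm(x), np.linalg.norm(xp)
        theta = np.arccos(np.clip(x @ xp / (nx * nxp), -1.0, 1.0))
        return nx * nxp * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)

    x, xp = rng.standard_normal(5), rng.standard_normal(5)
    for m in (100, 10_000, 1_000_000):
        print(m, empirical_kernel(x, xp, m), arccos_kernel(x, xp))

As the printed estimates suggest, the empirical kernel concentrates around the closed form as the width increases; compositional kernels extend this correspondence to deep networks by iterating the kernel map layer by layer.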
