Mehler's Formula, Branching Process, and Compositional Kernels of Deep Neural Networks
Tengyuan Liang, Hai Tran-Bach
Kernel methods and deep neural networks are arguably two representative methods that achieve state-of-the-art results in regression and classification tasks. However, unlike kernel methods, where both the statistical and computational aspects of learning are reasonably well understood, many theoretical puzzles remain around the generalization, computation, and representation aspects of deep neural networks (Zhang et al., 2017). One promising direction for resolving some of these puzzles is through the lens of kernels (Rahimi and Recht, 2008, 2009; Cho and Saul, 2009; Belkin et al., 2018b). Such a connection can be readily observed in a two-layer infinite-width network with random weights; see the pioneering work of Neal (1996a) and Rahimi and Recht (2008, 2009). For deep networks with hierarchical structures and randomly initialized weights, compositional kernels (Daniely et al., 2017b) were proposed to rigorously characterize this connection, with promising empirical performance (Cho and Saul, 2009).
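As a minimal illustration of the two-layer random-weight connection mentioned above: a finite random-feature map built from ReLU units approximates the first-order arc-cosine kernel of Cho and Saul (2009), which has a closed form. The sketch below (a Monte Carlo check, not the paper's construction) compares the two; the feature count `m` and the Gaussian weight distribution are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 5, 200_000  # input dimension; number of random features (illustrative)

x = rng.standard_normal(d)
y = rng.standard_normal(d)

# Random-feature map phi(x) = ReLU(W x) with rows of W drawn i.i.d. N(0, I).
W = rng.standard_normal((m, d))
phi_x = np.maximum(W @ x, 0.0)
phi_y = np.maximum(W @ y, 0.0)

# Monte Carlo estimate of 2 * E[ReLU(w.x) * ReLU(w.y)].
mc_kernel = 2.0 * (phi_x @ phi_y) / m

# Closed form: first-order arc-cosine kernel (Cho & Saul, 2009),
# k(x, y) = (1/pi) * ||x|| ||y|| * (sin(t) + (pi - t) cos(t)),  t = angle(x, y).
nx, ny = np.linalg.norm(x), np.linalg.norm(y)
theta = np.arccos(np.clip((x @ y) / (nx * ny), -1.0, 1.0))
exact = nx * ny / np.pi * (np.sin(theta) + (np.pi - theta) * np.cos(theta))

print(mc_kernel, exact)  # the two values should agree closely
```

As the width `m` grows, the empirical inner product of the random features concentrates around the deterministic kernel, which is the basic mechanism behind the infinite-width correspondence.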
Apr-9-2020