Theoretical Analysis of the Inductive Biases in Deep Convolutional Networks

Neural Information Processing Systems 

In this paper, we provide a theoretical analysis of the inductive biases in convolutional neural networks (CNNs). We start by examining the universality of CNNs, i.e., their ability to approximate any continuous function. We prove that a depth of \mathcal{O}(\log d) suffices for deep CNNs to achieve this universality, where d is the input dimension. Additionally, we establish that learning sparse functions with CNNs requires only \widetilde{\mathcal{O}}(\log^2 d) samples, indicating that deep CNNs can efficiently capture {\em long-range} sparse correlations. These results are made possible through a novel combination of multichanneling and downsampling when increasing the network depth.
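The \mathcal{O}(\log d) depth bound is driven by downsampling: each stride-2 convolution halves the spatial dimension, so roughly \log_2 d layers collapse a length-d input to a single output whose value depends on every coordinate. A minimal numerical sketch of this counting argument (the specific weights, bias, and tanh nonlinearity below are illustrative choices, not the construction from the paper):

```python
import math
import numpy as np

def downsample_depth(d: int, stride: int = 2) -> int:
    """Number of stride-2 layers needed to reduce a length-d input to length 1: ceil(log2 d)."""
    depth, n = 0, d
    while n > 1:
        n = math.ceil(n / stride)
        depth += 1
    return depth

def conv_layer(x: np.ndarray, w: np.ndarray, b: float) -> np.ndarray:
    """Illustrative 1-D convolution with kernel size 2 and stride 2 (single channel)."""
    if len(x) % 2:  # zero-pad odd lengths so adjacent pairs line up
        x = np.append(x, 0.0)
    return np.tanh(w[0] * x[0::2] + w[1] * x[1::2] + b)

d = 1024
x = np.random.randn(d)
depth = downsample_depth(d)
for _ in range(depth):
    x = conv_layer(x, w=np.array([0.5, 0.5]), b=0.1)
print(depth, len(x))  # depth == 10 == log2(1024), output has length 1
```

The sketch keeps a single channel for brevity; the paper's construction additionally grows the number of channels ("multichanneling") with depth, which is what lets the network store enough intermediate information for universality at this depth.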