Optimal Approximation and Learning Rates for Deep Convolutional Neural Networks
One of the most important reasons for its success is the architecture (or structure) [2], which autonomously encodes a priori information in the network and, at the same time, significantly reduces the number of free parameters to improve the learning performance. Deep convolutional neural networks (DCNNs), a widely used class of structured deep neural networks, have triggered enormous research activity in both applications [3, 4, 5] and theoretical analysis [6, 7, 8]. In this paper, we focus on the approximation and learning performance analysis of DCNNs induced by the rectified linear unit (ReLU) σ(t) = max{t,0}.
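To make the parameter-reduction remark concrete, here is a minimal NumPy sketch. It assumes a 1-D convolutional layer with a filter of length s + 1 acting on an input of dimension d and producing d + s outputs; the helper `conv_matrix`, the filter `w`, and the sizes are illustrative choices, not taken from the paper. The layer is written as a sparse Toeplitz-type matrix followed by the ReLU σ(t) = max{t,0}, and its free weights are counted against those of an unstructured dense layer of the same shape.

```python
import numpy as np

def relu(t):
    # ReLU activation: sigma(t) = max{t, 0}, applied elementwise
    return np.maximum(t, 0.0)

def conv_matrix(w, d):
    # Hypothetical helper: (d + s) x d Toeplitz-type matrix whose action on x
    # equals the full convolution of x with the filter w (len(w) = s + 1).
    s = len(w) - 1
    T = np.zeros((d + s, d))
    for j in range(d):
        T[j:j + s + 1, j] = w  # each column is a shifted copy of the filter
    return T

d = 6                              # input dimension (illustrative)
w = np.array([1.0, -2.0, 0.5])     # filter of length s + 1 = 3 (illustrative)
x = np.arange(1.0, d + 1.0)

T = conv_matrix(w, d)
conv_out = relu(T @ x)             # one convolutional layer, bias omitted

conv_params = w.size               # free weights in the convolutional layer
dense_params = T.size              # free weights in a dense layer of the same shape
print(conv_params, dense_params)
```

Under these assumptions the convolutional layer shares its s + 1 weights across all columns, whereas a dense layer of the same shape would have d(d + s) independent weights.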
Deep Convolutional Neural Networks with Zero-Padding: Feature Extraction and Learning
Han, Zhi, Liu, Baichen, Lin, Shao-Bo, Zhou, Ding-Xuan
Abstract--This paper studies the performance of deep convolutional neural networks (DCNNs) with zero-padding in feature extraction and learning. After verifying the role of zero-padding in enabling translation-equivalence and the role of pooling in its translation-invariance driven nature, we show that, with a similar number of free parameters, any deep fully connected network (DFCN) can be represented by a DCNN with zero-padding. This demonstrates that DCNNs with zero-padding are essentially better than DFCNs in feature extraction. Consequently, we derive the universal consistency of DCNNs with zero-padding and show their translation-invariance in the learning process. All our theoretical results are verified by numerical experiments, including both toy simulations and real-data experiments. For example, the kernel approach [6] utilizes a kernel-based mapping to extract features and deep learning [7] employs deep neural networks (deep nets and hierarchy-structure [24].
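The translation properties mentioned in the abstract can be illustrated on a toy 1-D signal. The sketch below is only an illustration under stated assumptions, not the authors' construction: zero-padding is modeled by NumPy's full convolution, a translation is modeled as a shift into trailing zeros of the input, and pooling is modeled as a global maximum. The names `conv_layer`, `x`, `w`, and `shift` are hypothetical.

```python
import numpy as np

def relu(t):
    # ReLU activation: sigma(t) = max{t, 0}, applied elementwise
    return np.maximum(t, 0.0)

def conv_layer(x, w):
    # Full convolution acts as if x were zero-padded on both sides,
    # so the output has length len(x) + len(w) - 1.
    return relu(np.convolve(x, w, mode="full"))

x = np.array([1.0, -2.0, 3.0, 0.5])        # toy signal (illustrative)
w = np.array([0.5, -1.0, 0.25])            # toy filter (illustrative)

# Append trailing zeros so the signal can be shifted without losing entries.
x_pad = np.concatenate([x, np.zeros(3)])
shift = 2
x_shifted = np.roll(x_pad, shift)          # only zeros wrap around to the front

y = conv_layer(x_pad, w)
y_shifted = conv_layer(x_shifted, w)

# Shifting the input shifts the feature map by the same amount.
shift_commutes = np.allclose(y_shifted, np.roll(y, shift))

# A global max over the feature map does not change under the shift.
pooling_unchanged = np.isclose(y.max(), y_shifted.max())

print(shift_commutes, pooling_unchanged)
```

The check relies on the shift being no larger than the number of trailing zeros; a larger shift would wrap nonzero entries around and the comparison would no longer apply.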