Optimal Approximation and Learning Rates for Deep Convolutional Neural Networks

arXiv.org Artificial Intelligence

One of the most important reasons for its success is the architecture (or structure) [2], which autonomously encodes a priori information in the network and, at the same time, significantly reduces the number of free parameters, improving learning performance. Deep convolutional neural networks (DCNNs), a widely used class of structured deep neural networks, have triggered enormous research activity in both applications [3, 4, 5] and theoretical analysis [6, 7, 8]. In this paper, we focus on the approximation and learning performance of DCNNs induced by the rectified linear unit (ReLU) σ(t) = max{t,0}.
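As a rough illustration of the two ingredients named above, the following Python sketch implements the ReLU activation σ(t) = max{t,0} and compares weight counts for a dense layer and a convolutional layer. The function names, the layer sizes, and the choice to count weights only (no biases) are assumptions made for this example and are not taken from the paper.

```python
def relu(t: float) -> float:
    """ReLU activation: sigma(t) = max{t, 0}."""
    return max(t, 0.0)


def dense_weight_count(d_in: int, d_out: int) -> int:
    """Weights in a fully connected layer: one per input/output pair."""
    return d_in * d_out


def conv_weight_count(filter_length: int) -> int:
    """Weights in a 1D convolutional layer: one shared filter,
    reused at every position (biases are ignored in this sketch)."""
    return filter_length


if __name__ == "__main__":
    print(relu(-2.0), relu(3.5))
    # Hypothetical sizes, chosen only to show the gap in free parameters.
    print(dense_weight_count(100, 100), conv_weight_count(5))
```

Because the filter is shared across positions, the convolutional layer's weight count does not grow with the input dimension, which is one way to read the claim that the structure reduces the number of free parameters.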


Deep Convolutional Neural Networks with Zero-Padding: Feature Extraction and Learning

arXiv.org Artificial Intelligence

Abstract: This paper studies the performance of deep convolutional neural networks (DCNNs) with zero-padding in feature extraction and learning. After verifying the role of zero-padding in enabling translation-equivalence, and the role of pooling in providing translation-invariance, we show that, with a similar number of free parameters, any deep fully connected network (DFCN) can be represented by a DCNN with zero-padding. This demonstrates that DCNNs with zero-padding are essentially better than DFCNs in feature extraction. Consequently, we derive universal consistency of DCNNs with zero-padding and show their translation-invariance in the learning process. All our theoretical results are verified by numerical experiments, including both toy simulations and real-data experiments. For example, the kernel approach [6] uses a kernel-based mapping to extract features, and deep learning [7] employs deep neural networks (deep nets and hierarchy-structure [24].
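The translation properties mentioned in the abstract can be checked on a toy signal. The sketch below is a minimal pure-Python illustration, not the paper's construction: it assumes a 1D "full" zero-padded convolution, a shift operator that fills with zeros, and global max pooling after ReLU. All function names and the test signal are hypothetical.

```python
def conv1d_zero_padded(x, w):
    """Full 1D convolution: out[i] = sum_k w[k] * x[i - k],
    treating x as zero outside its index range (zero-padding)."""
    d, s = len(x), len(w)
    out = []
    for i in range(d + s - 1):
        acc = 0.0
        for k in range(s):
            j = i - k
            if 0 <= j < d:
                acc += w[k] * x[j]
        out.append(acc)
    return out


def shift_right(x, n):
    """Translate x by n positions, filling with zeros on the left.
    Assumes the last n entries of x are zero, so no content is lost."""
    return [0.0] * n + x[:len(x) - n]


def relu_global_max_pool(v):
    """Apply ReLU entrywise, then take the global maximum."""
    return max(max(t, 0.0) for t in v)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 0.0, 0.0]   # toy signal with trailing zeros
    w = [1.0, -1.0]                 # toy difference filter

    # Convolving a shifted input gives the shifted output.
    lhs = conv1d_zero_padded(shift_right(x, 1), w)
    rhs = shift_right(conv1d_zero_padded(x, w), 1)
    print(lhs == rhs)

    # Pooling removes the dependence on the shift.
    print(relu_global_max_pool(conv1d_zero_padded(x, w)) ==
          relu_global_max_pool(lhs))
```

In this toy setting the convolution commutes with the shift, and the pooled feature is unchanged by it. The check relies on the signal having enough trailing zeros that the shift does not push content out of range.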