Neural networks: deep, shallow, or in between?

Petrova, Guergana, Wojtaszczyk, Przemyslaw

arXiv.org Machine Learning 

The fascinating new developments in the area of Artificial Intelligence (AI) and other important applications of neural networks prompt the need for a theoretical mathematical study of their potential to reliably approximate complicated objects. Various network architectures have been used in different applications with substantial success, but without significant theoretical backing for the choices made. Thus, a natural question to ask is whether and how the chosen architecture affects the approximation power of the outputs of the resulting neural network. In this paper, we attempt to clarify how the width and the depth of a feed-forward neural network affect its worst-case performance. More precisely, we provide estimates from below for the error of approximation of a compact subset K ⊂ X of a Banach space X by the outputs of feed-forward neural networks (NNs) with width W, depth l, bound w(W,l) on their parameters, and Lipschitz activation functions. Note that the ReLU function is included in our investigation since it is a Lipschitz function with Lipschitz constant L = 1. To prove our results, we assume that we know lower bounds on the entropy numbers of the compact sets K that we approximate by the outputs of feed-forward NNs.
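To make the objects in the abstract concrete, here is a minimal sketch of a feed-forward ReLU network whose parameters are bounded in absolute value. It is an illustration only, not the authors' construction. The names `make_network`, `forward`, and `max_abs_parameter`, the fully connected layout, the convention that depth counts hidden layers, and the uniform random initialization are all assumptions; `bound` plays the role of w(W,l).

```python
import random


def relu(t):
    """ReLU activation; it is Lipschitz with constant 1."""
    return max(0.0, t)


def make_network(d_in, d_out, width, depth, bound, seed=0):
    """Random parameters for a fully connected network (hypothetical helper).

    Assumption: `depth` counts hidden layers, each with `width` neurons.
    Every weight and bias is drawn from [-bound, bound].
    """
    rng = random.Random(seed)
    sizes = [d_in] + [width] * depth + [d_out]
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-bound, bound) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [rng.uniform(-bound, bound) for _ in range(n_out)]
        layers.append((weights, biases))
    return layers


def forward(layers, x):
    """Evaluate the network: affine map, then ReLU on every layer but the last."""
    h = list(x)
    for i, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, h)) + b
             for row, b in zip(weights, biases)]
        h = z if i == len(layers) - 1 else [relu(t) for t in z]
    return h


def max_abs_parameter(layers):
    """Largest absolute value among all weights and biases."""
    values = []
    for weights, biases in layers:
        values.extend(abs(p) for row in weights for p in row)
        values.extend(abs(b) for b in biases)
    return max(values)
```

Calling `forward(make_network(2, 1, width=4, depth=3, bound=0.5), [0.3, -0.7])` evaluates one such network at a point. The set of all outputs obtained by varying the parameters within the bound is the approximating family whose error the paper estimates from below.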
