Neural networks: deep, shallow, or in between?
Petrova, Guergana, Wojtaszczyk, Przemyslaw
The fascinating new developments in the area of Artificial Intelligence (AI) and other important applications of neural networks prompt the need for a theoretical mathematical study of their potential to reliably approximate complicated objects. Various network architectures have been used in different applications with substantial success rates without significant theoretical backing of the choices made. Thus, a natural question to ask is whether and how the architecture chosen affects the approximation power of the outputs of the resulting neural network. In this paper, we attempt to clarify how the width and the depth of a feed-forward neural network affect its worst-case performance. More precisely, we provide estimates from below for the error of approximation of a compact subset K ⊂ X of a Banach space X by the outputs of feed-forward neural networks (NNs) with width W, depth l, bound w(W,l) on their parameters, and Lipschitz activation functions. Note that the ReLU function is included in our investigation since it is a Lipschitz function with a Lipschitz constant L = 1. To prove our results, we assume that we know lower bounds on the entropy numbers of the compact sets K that we approximate by the outputs of feed-forward NNs.
Oct-11-2023
- Country:
- North America > United States
- Texas > Brazos County > College Station (0.14)
- Europe
- Poland (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- Genre:
- Research Report > New Finding (0.34)
- Technology: