Neural networks: deep, shallow, or in between?

Petrova, Guergana, Wojtaszczyk, Przemyslaw

Oct-11-2023, arXiv.org Machine Learning

The fascinating new developments in the area of Artificial Intelligence (AI) and other important applications of neural networks call for a theoretical mathematical study of their potential to reliably approximate complicated objects. Various network architectures have been used in different applications with substantial success, but without significant theoretical backing for the choices made. Thus, a natural question is whether and how the chosen architecture affects the approximation power of the outputs of the resulting neural network. In this paper, we attempt to clarify how the width and the depth of a feed-forward neural network affect its worst-case performance. More precisely, we provide estimates from below for the error of approximation of a compact subset K ⊂ X of a Banach space X by the outputs of feed-forward neural networks (NNs) with width W, depth l, bound w(W,l) on their parameters, and Lipschitz activation functions. Note that the ReLU function is included in our investigation since it is a Lipschitz function with Lipschitz constant L = 1. To prove our results, we assume that lower bounds are known on the entropy numbers of the compact sets K that we approximate by the outputs of feed-forward NNs.
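As a rough illustration of the objects named in the abstract, the following sketch builds a feed-forward ReLU network whose parameters are bounded in absolute value and numerically checks that ReLU has Lipschitz constant 1. Everything here is an assumption made for illustration, not the authors' construction: the helper names (relu, lipschitz_ratio, make_network, forward), the convention that "depth" counts hidden layers, the scalar output, and the uniform sampling of parameters are all hypothetical choices.

```python
import random


def relu(t: float) -> float:
    """ReLU activation: max(0, t)."""
    return max(0.0, t)


def lipschitz_ratio(f, x: float, y: float) -> float:
    """Difference quotient |f(x) - f(y)| / |x - y| for x != y."""
    return abs(f(x) - f(y)) / abs(x - y)


def make_network(d_in: int, width: int, depth: int, bound: float, rng: random.Random):
    """Sample a network with `depth` hidden layers of size `width` and one output.

    Assumption: every weight and bias is drawn uniformly from [-bound, bound],
    so `bound` plays the role of the parameter bound w(W, l) in the abstract.
    """
    sizes = [d_in] + [width] * depth + [1]
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-bound, bound) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [rng.uniform(-bound, bound) for _ in range(n_out)]
        layers.append((weights, biases))
    return layers


def forward(layers, x):
    """Evaluate the network at x: ReLU after each hidden layer, affine output."""
    h = list(x)
    last = len(layers) - 1
    for i, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, h)) + b
             for row, b in zip(weights, biases)]
        h = z if i == last else [relu(t) for t in z]
    return h[0]


if __name__ == "__main__":
    rng = random.Random(0)
    # The difference quotient of ReLU never exceeds 1, and equals 1 on positive inputs.
    pairs = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(1000)]
    worst = max(lipschitz_ratio(relu, x, y) for x, y in pairs if x != y)
    print("largest observed ReLU difference quotient:", worst)

    net = make_network(d_in=2, width=4, depth=3, bound=0.5, rng=rng)
    print("network output at (0.3, -0.7):", forward(net, [0.3, -0.7]))
```

The sketch only fixes notation for width, depth, and the parameter bound; it does not compute or estimate any of the lower bounds discussed in the paper.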

approximation, artificial intelligence, machine learning, (17 more...)


Country:
- North America > United States
  - Texas > Brazos County > College Station (0.14)
- Europe
  - Poland (0.04)
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)

Genre:
- Research Report > New Finding (0.34)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)
