Interpolation property of shallow neural networks

Constantinescu, Vlad-Raul, Popescu, Ionel

arXiv.org Artificial Intelligence 

We study the geometry of global minima of the loss landscape of overparametrized neural networks. In the light of the interpolation threshold outlined in [1] one of the important issues in neural networks is to have guarantees that the interpolation is indeed achieved. We tackle this problem for the case of shallow neural network and show that this holds true in general as long as the activation is not a polynomial of low degree. Standard optimization problems are done in the case the loss function is convex in which case we only have a global minima. Another class of optimization problems is for nonconvex loss functions which has a discrete number of global minima. Recently there has been interesting progress aimed at understanding the locus of the global minima for overparametrized neural networks ([3], [6]) when the activation function is continuous. In this paper, we generalize these results in section 2 for a larger class of activation functions. More precisely, we prove that in the overparametrized regime, we can interpolate any data set consisting of d points with a shallow neural network having at least d neurons on the hidden layer and with an activation function which is locally integrable and not almost everywhere a polynomial of degree at most d 2. In addition, if the activation function is also smooth, the locus of global minima of the loss landscape of an over-parametrized neural network is a submanifold of R

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found