The Upper Bound on Knots in Neural Networks

Chen, Kevin K.

arXiv.org Machine Learning 

In recent years, neural networks--and deep neural networks in particular--have succeeded so well across such a broad range of data-driven problems that they have heralded a paradigm shift in how data science is approached. Many everyday computerized tasks--such as image and optical character recognition, the personalization of Internet search results and advertisements, and even playing games such as chess, backgammon, and Go--have been deeply impacted and vastly improved by the application of neural networks. The applications of neural networks, however, have advanced significantly more rapidly than the theoretical understanding of their successes. Elements of neural network structure--such as the division of vector spaces into convex polytopes, and the application of nonlinear activation functions--afford neural networks great flexibility to model many classes of functions with spectacular accuracy. This flexibility is embodied in universal approximation theorems (Cybenko 1989; Hornik et al. 1989; Hornik 1991; Sonoda and Murata 2015), which essentially state that neural networks can model any continuous function arbitrarily well. The complexity of neural networks, however, has also made their analytical understanding somewhat elusive. The general thrust of this paper, as well as of two companion papers (Chen et al. 2016b,a), is to explore some unsolved elements of neural network theory, and to do so in a way that is independent of specific problems. In the broadest sense, we seek to understand what models neural networks are capable of producing. Many variations of neural networks exist--such as convolutional neural networks, recurrent neural networks, and long short-term memory models--each with its own arenas of success.
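The piecewise structure alluded to above can be made concrete in the simplest setting. The following sketch (my own illustration, not code from the paper) builds a one-hidden-layer ReLU network on a scalar input; such a network is piecewise linear, and each hidden unit can contribute at most one "knot" (breakpoint), located where its pre-activation changes sign. The weights and the helper names `network` and `knots` are hypothetical choices for this example.

```python
# A one-hidden-layer ReLU network on a scalar input is piecewise linear.
# Each hidden unit (w, b) with w != 0 switches on/off at x = -b / w,
# so the number of knots is bounded by the number of hidden units.

def relu(x):
    return max(0.0, x)

def network(x, hidden, w_out):
    # hidden: list of (input weight, bias) pairs; w_out: output-layer weights.
    return sum(v * relu(w * x + b) for (w, b), v in zip(hidden, w_out))

def knots(hidden):
    # Candidate breakpoints: one per hidden unit with a nonzero input weight.
    return sorted({-b / w for w, b in hidden if w != 0.0})

# Three hidden units (arbitrary example weights).
hidden = [(1.0, 0.0), (1.0, -1.0), (-2.0, 1.0)]
w_out = [1.0, -2.0, 0.5]

print(knots(hidden))  # → [0.0, 0.5, 1.0], at most one knot per hidden unit
```

Between consecutive knots every hidden unit is either fully active or fully inactive, so the network restricted to that interval is an affine function; the knots are exactly where its slope can change.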
