Goldilocks Neural Networks

Rosenzweig, Jan, Cvetkovic, Zoran, Rosenzweig, Ivana

arXiv.org Machine Learning 

Training deep neural networks is an important problem that is still far from solved. At the core of the problem is our still relatively poor understanding of what happens under the hood of a deep neural network. In practice, this translates into a wide variety of deep network architectures and of activation functions used in them. They all, however, suffer from the same problem when it comes to interpretability. It is next to impossible to understand how and why even a single-layer network performs a simple classification task, and this problem only grows with the size and depth of the network. Activation functions stem from Cybenko's seminal 1989 paper [1], which proved that sigmoidal functions are universal approximators. This gave rise to a number of sigmoidal activation functions, including the sigmoid, tanh, arctan, binary step, Elliott sign [2], SoftSign [3], [4], SQNL [5], soft clipping [6] and many others. Sigmoidal activations were useful in the early days of neural networks, but their most serious problem was vanishing gradients.
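As a brief illustration of the vanishing-gradient issue mentioned above (a minimal sketch, not taken from the paper): the derivatives of sigmoidal activations saturate for large inputs, and multiplying many such small factors across layers drives gradients toward zero.

```python
# Illustration only: sigmoidal activations saturate, so their derivatives
# shrink toward zero for large |x|. Chaining many such layers multiplies
# these small factors, which is the root of the vanishing-gradient problem.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)          # maximum 0.25, attained at x = 0

def tanh_grad(x):
    return 1.0 - np.tanh(x) ** 2  # maximum 1.0, attained at x = 0

for x in [0.0, 2.0, 5.0, 10.0]:
    print(f"x = {x:5.1f}  sigmoid' = {sigmoid_grad(x):.2e}  tanh' = {tanh_grad(x):.2e}")

# Even at the optimum x = 0, chaining n sigmoid layers scales the gradient by
# at most 0.25**n (e.g. 0.25**10 is roughly 1e-6), so early layers learn slowly.
```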
