How to Choose an Activation Function
Mhaskar, H. N., Micchelli, C. A..
–Neural Information Processing Systems
We study the complexity problem in artificial feedforward neural networks designed to approximate real valued functions of several real variables; i.e., we estimate the number of neurons in a network required to ensure a given degree of approximation to every function in a given function class. We indicate how to construct networks with the indicated number of neurons evaluating standard activation functions. Our general theorem shows that the smoother the activation function, the better the rate of approximation. 1 INTRODUCTION The approximation capabilities of feedforward neural networks with a single hidden layer has been studied by many authors, e.g., [1, 2, 5]. In [10], we have shown that such a network using practically any nonlinear activation function can approximate any continuous function of any number of real variables on any compact set to any desired degree of accuracy. A central question in this theory is the following.
Neural Information Processing Systems
Dec-31-1994