Worst-case Loss Bounds for Single Neurons

Helmbold, David P., Kivinen, Jyrki, Warmuth, Manfred K.

Neural Information Processing Systems 

We analyze and compare the well-known Gradient Descent algorithm and a new algorithm, called the Exponentiated Gradient algorithm, for training a single neuron with an arbitrary transfer function. Both algorithms are easily generalized to larger neural networks, and the generalization of Gradient Descent is the standard back-propagation algorithm. In this paper we prove worst-case loss bounds for both algorithms in the single neuron case. Since local minima make it difficult to prove worst-case bounds for gradient-based algorithms, we must use a loss function that prevents the formation of spurious local minima. We define such a matching loss function for any strictly increasing differentiable transfer function and prove worst-case loss bounds for any such transfer function and its corresponding matching loss. For example, the matching loss for the identity function is the square loss and the matching loss for the logistic sigmoid is the entropic loss. The different structure of the bounds for the two algorithms indicates that the new algorithm outperforms Gradient Descent when the inputs contain a large number of irrelevant components.
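For intuition, the matching loss for a strictly increasing differentiable transfer function $\phi$ is commonly written as $L_\phi(y, \hat y) = \int_{\phi^{-1}(y)}^{\phi^{-1}(\hat y)} (\phi(z) - y)\,dz$, which gives $\tfrac12(\hat y - y)^2$ for the identity and the entropic loss for the logistic sigmoid; its gradient with respect to the weight vector reduces to $(\hat y - y)\,x$. The sketch below is not the paper's pseudocode: the learning rate, the normalization of the Exponentiated Gradient weights, and the toy data are illustrative assumptions. It merely contrasts the additive Gradient Descent update with the multiplicative Exponentiated Gradient update for a single sigmoid neuron under this loss.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gd_update(w, x, y, eta):
    """Gradient Descent step for a sigmoid neuron under its matching (entropic) loss.
    The matching-loss gradient reduces to (prediction - target) * x."""
    y_hat = sigmoid(w @ x)
    return w - eta * (y_hat - y) * x

def eg_update(w, x, y, eta):
    """Exponentiated Gradient step: multiplicative update, then renormalization so the
    weights remain a probability vector (an illustrative assumption here)."""
    y_hat = sigmoid(w @ x)
    w_new = w * np.exp(-eta * (y_hat - y) * x)
    return w_new / w_new.sum()

# Toy usage on a single (x, y) pair; values are made up for illustration.
x = np.array([1.0, 0.0, 0.5, -0.2])
y = 0.8
w_gd = np.full(4, 0.25)
w_eg = np.full(4, 0.25)
for _ in range(10):
    w_gd = gd_update(w_gd, x, y, eta=0.5)
    w_eg = eg_update(w_eg, x, y, eta=0.5)
print(w_gd, w_eg)
```

The multiplicative form of the EG update is what the abstract's comparison refers to: it treats the weight components very differently from the additive Gradient Descent step, which is the regime in which the bounds suggest EG wins when many input components are irrelevant.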
