Tempering Backpropagation Networks: Not All Weights are Created Equal
Schraudolph, Nicol N., Sejnowski, Terrence J.
Backpropagation learning algorithms typically collapse the network's structure into a single vector of weight parameters to be optimized. We suggest that their performance may be improved by utilizing the structural information instead of discarding it, and introduce a framework for "tempering" each weight accordingly. In the tempering model, activation and error signals are treated as approximately independent random variables. The characteristic scale of weight changes is then matched to that of the residuals, allowing structural properties such as a node's fan-in and fan-out to affect the local learning rate and backpropagated error. The model also permits calculation of an upper bound on the global learning rate for batch updates, which in turn leads to different update rules for bias vs. non-bias weights. This approach yields hitherto unparalleled performance on the family relations benchmark, a deep multi-layer network: for both batch learning with momentum and the delta-bar-delta algorithm, convergence at the optimal learning rate is sped up by more than an order of magnitude.
Neural Information Processing Systems
Dec-31-1996
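
To make the structural idea from the abstract concrete, here is a minimal sketch, not the authors' implementation: it tempers each layer's batch learning rate by the inverse of its fan-in and gives bias weights a separate base rate. The toy network, the parity task, and the specific 1/fan-in rule are illustrative assumptions in the spirit of the framework, not quantities taken from the paper.

```python
# Sketch: per-layer learning rates tempered by fan-in (assumed heuristic).
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 8 binary inputs, parity target (illustrative only).
X = rng.integers(0, 2, size=(256, 8)).astype(float)
y = (X.sum(axis=1) % 2).reshape(-1, 1)

def init(fan_in, fan_out):
    # Scale initial weights with fan-in so pre-activations start at unit scale.
    return rng.normal(0.0, 1.0 / np.sqrt(fan_in), size=(fan_in, fan_out))

W1, b1 = init(8, 16), np.zeros(16)
W2, b2 = init(16, 1), np.zeros(1)

base_lr = 1.0
# Tempered local learning rates: each weight matrix's rate is divided by its
# fan-in, so nodes with many incoming connections take proportionally smaller
# steps. Biases keep the base rate (treated as fan-in 1 in this heuristic),
# echoing the abstract's separate treatment of bias vs. non-bias weights.
lr_W1, lr_W2 = base_lr / W1.shape[0], base_lr / W2.shape[0]
lr_b1, lr_b2 = base_lr, base_lr

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for epoch in range(2000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass (mean squared error, batch gradients)
    err_out = (out - y) * out * (1 - out)
    err_h = (err_out @ W2.T) * h * (1 - h)

    # Batch updates with structure-dependent (tempered) learning rates
    W2 -= lr_W2 * h.T @ err_out / len(X)
    b2 -= lr_b2 * err_out.mean(axis=0)
    W1 -= lr_W1 * X.T @ err_h / len(X)
    b1 -= lr_b1 * err_h.mean(axis=0)

print("final MSE:", float(((out - y) ** 2).mean()))
```

The same idea extends to scaling the backpropagated error and to momentum or delta-bar-delta updates; only the constant per-layer rates shown here would change.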