Time Trials on Second-Order and Variable-Learning-Rate Algorithms

Rohwer, Richard

Neural Information Processing Systems 

In four of these methods, the gradient is divided component-wise by a decaying average of either the second derivatives or their absolute values.
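The component-wise scaling described above can be illustrated with a minimal sketch. It is not the paper's exact update rule, which this abstract does not give. It assumes the diagonal second derivatives are available, and the function names, the `decay`, `lr`, and `eps` parameters, and the quadratic test problem are all hypothetical choices made for illustration.

```python
def scaled_gradient_step(w, grad, second, avg, lr=0.5, decay=0.9,
                         use_abs=True, eps=1e-8):
    """One update in which each gradient component is divided by a
    decaying average of the matching second derivative (or its
    absolute value when use_abs is True).

    All parameter names and defaults are illustrative assumptions.
    """
    new_w, new_avg = [], []
    for wi, gi, hi, ai in zip(w, grad, second, avg):
        h = abs(hi) if use_abs else hi
        # Decaying (exponentially weighted) average of the curvature.
        a = decay * ai + (1.0 - decay) * h
        # Keep the divisor away from zero while preserving its sign.
        if abs(a) > eps:
            denom = a
        else:
            denom = eps if a >= 0 else -eps
        new_w.append(wi - lr * gi / denom)
        new_avg.append(a)
    return new_w, new_avg


def minimize_quadratic(curvatures, w0, steps=60):
    """Minimize f(w) = 0.5 * sum(c_i * w_i**2), a hypothetical test
    problem with gradient c_i * w_i and second derivative c_i."""
    w = list(w0)
    avg = list(curvatures)  # assumption: start the average at the initial curvature
    for _ in range(steps):
        grad = [c * wi for c, wi in zip(curvatures, w)]
        w, avg = scaled_gradient_step(w, grad, curvatures, avg)
    return w


if __name__ == "__main__":
    # Badly conditioned problem: curvatures differ by a factor of 100.
    print(minimize_quadratic([1.0, 100.0], [1.0, 1.0]))
```

On this toy problem the per-component divisor makes both coordinates shrink at about the same rate despite the hundredfold difference in curvature, which is the motivation for scaling component-wise rather than using a single learning rate.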
