Reviews: On the Local Hessian in Back-propagation

Neural Information Processing Systems 

The authors argue that back-propagation with respect to a loss function is equivalent to a single step of a "back-matching propagation" procedure: after a forward pass, the weights and input activations of each block are alternately optimized, from the last block to the first, to minimize a matching loss on that block's output. They further propose that architectures and training procedures which improve the condition number of the local Hessian of this back-matching loss train more efficiently, and support this claim by analytically studying the effects of orthonormal initialization, skip connections, and batch normalization. As additional evidence, they design a block-wise learning-rate scaling method based on an approximation of the back-matching loss and demonstrate improved learning curves for VGG13 on CIFAR10 and CIFAR100.
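To make the summarized procedure concrete, here is a minimal NumPy sketch of one back-matching sweep on a two-block linear network. It assumes squared-error matching losses and a single gradient step per block (the choice of matching loss, step size, and network are illustrative, not taken from the paper); the last block matches the label, and each block's updated input activation becomes the matching target for the block before it.

```python
import numpy as np

# Illustrative sketch (assumed details): back-matching propagation on a
# two-block linear network. Each block b, from last to first, takes one
# gradient step on a local matching loss
#   L_b(W_b, h_{b-1}) = 0.5 * ||W_b h_{b-1} - target_b||^2,
# updating both its weights and its input activation; the updated input
# then serves as the target for the preceding block.

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))   # block 1 weights
W2 = rng.normal(size=(2, 4))   # block 2 weights
x = rng.normal(size=(3,))      # input
y = rng.normal(size=(2,))      # label
lr = 0.1                       # step size for the activation update

# forward pass, caching each block's output
h1 = W1 @ x
h2 = W2 @ h1

# block 2: match its output to the label
r2 = h2 - y                        # grad of 0.5*||h2 - y||^2 w.r.t. h2
gW2 = np.outer(r2, h1)             # matching-loss gradient w.r.t. W2
target_h1 = h1 - lr * (W2.T @ r2)  # one step on the input activation

# block 1: match its output to the updated activation target
r1 = h1 - target_h1
gW1 = np.outer(r1, x)

# With this squared-error matching loss, r1 = lr * W2.T @ r2, so the
# block-1 update direction coincides with the ordinary back-prop
# gradient up to the scaling factor lr:
bp_gW1 = np.outer(W2.T @ r2, x)
print(np.allclose(gW1, lr * bp_gW1))  # True
```

The final check illustrates the equivalence the review summarizes: in this linear special case, one back-matching step recovers the back-propagation gradient direction for the earlier block, rescaled by the activation step size.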