a57ecd54d4df7d999bd9c5e3b973ec75-Supplemental.pdf

Neural Information Processing Systems 

The overall effect is to reduce learning rates when the gradient magnitude is large.