When do spectral gradient updates help in deep learning?