On the Variance of the Fisher Information for Deep Learning