Adaptive Learning Rate via Covariance Matrix Based Preconditioning for Deep Neural Networks