AdaLoss: A computationally-efficient and provably convergent adaptive gradient method