Grams: Gradient Descent with Adaptive Momentum Scaling