Stochastic Normalized Gradient Descent with Momentum for Large Batch Training