Fast Mixing of Stochastic Gradient Descent with Normalization and Weight Decay