Understanding and Detecting Convergence for Stochastic Gradient Descent with Momentum