On Convergence of Training Loss Without Reaching Stationary Points