A Convergence Analysis of Log-Linear Training