Convergence Rates of Training Deep Neural Networks via Alternating Minimization Methods