A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks