Understanding Gradient Descent through the Training Jacobian