Appendix A Gradient Descent and Neural Tangent Kernel

Gradient Descent

Since we consider the square loss and ℓ