On the Convergence of (Stochastic) Gradient Descent for Kolmogorov--Arnold Networks