Multiclass Loss Geometry Matters for Generalization of Gradient Descent in Separable Classification
