The Implicit Bias of Gradient Descent on Separable Multiclass Data
