The Implicit Bias of Gradient Descent on Separable Data
