Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent
