Implicit bias of deep linear networks in the large learning rate phase
