The Implicit Bias of Gradient Descent on Generalized Gated Linear Networks
