Escaping mediocrity: how two-layer networks learn hard single-index models with SGD
