A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks

Open in new window