How Does Gradient Descent Learn Features - A Local Analysis for Regularized Two-Layer Neural Networks

Mo Zhou

Neural Information Processing Systems 

Feature learning has long been considered to be a major advantage of neural networks.