Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit

Neural Information Processing Systems 

Empirically, we find that a variety of neural network architectures successfully learn sparse parities, with discontinuous phase transitions in their training curves.
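The (n, k)-sparse parity task referenced here can be sketched as follows: inputs are uniform ±1 bit strings, and the label is the product (parity) of the coordinates in a hidden size-k subset. The function and variable names below are illustrative, not from the paper.

```python
import numpy as np

def sparse_parity_data(n_samples, n_bits, support, rng):
    """Sample uniform +/-1 inputs; the label is the parity (product)
    of the coordinates in `support`, the hidden size-k subset."""
    X = rng.choice([-1.0, 1.0], size=(n_samples, n_bits))
    y = np.prod(X[:, support], axis=1)  # +/-1 parity labels
    return X, y

rng = np.random.default_rng(0)
# An example (n, k) = (30, 3) instance with an arbitrary hidden support.
X, y = sparse_parity_data(1000, 30, [2, 7, 11], rng)
```

A network trained with SGD on batches of `(X, y)` sees chance-level accuracy for a long initial phase before the sharp transition described above.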