Sparse maximal update parameterization: A holistic approach to sparse training dynamics

Neural Information Processing Systems

Several challenges make it difficult for sparse neural networks to compete with dense models.