Global Sparse Momentum SGD for Pruning Very Deep Neural Networks