DominoSearch: Find layer-wise fine-grained N:M sparse schemes from dense neural networks

Neural Information Processing Systems 

Neural pruning is a widely used compression technique for Deep Neural Networks (DNNs).