SUBP: Soft Uniform Block Pruning for 1$\times$N Sparse CNNs Multithreading Acceleration
–Neural Information Processing Systems
Sparsity in Convolutional Neural Networks (CNNs) is widely studied as a way to compress and accelerate models in resource-constrained environments. By constraining N consecutive weights along the output channel to be group-wise non-zero, the recently proposed 1$\times$N sparse network has become very popular for three notable advantages: 1) Substantial storage savings from the Block Sparse Row (BSR) matrix format.
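To make the 1$\times$N pattern concrete, the following is a minimal sketch of a block mask, not the SUBP method itself. It assumes the convolution weights are flattened to a 2-D array of shape (output channels, input channels times kernel size), groups N consecutive output channels at the same input position into one block, and prunes the blocks with the smallest L1 norm in one shot. The function name one_by_n_mask and the global magnitude criterion are illustrative assumptions.

```python
import numpy as np


def one_by_n_mask(weight, n, sparsity):
    """Return a boolean keep-mask with a 1xN block pattern.

    weight   : array of shape (c_out, k), the flattened conv weights (assumed layout)
    n        : number of consecutive output channels per block
    sparsity : fraction of blocks to prune, between 0 and 1
    """
    c_out, k = weight.shape
    if c_out % n != 0:
        raise ValueError("c_out must be divisible by n")

    # Score each block by the L1 norm of its n weights.
    # Rows 0..n-1 form block row 0, rows n..2n-1 form block row 1, and so on.
    scores = np.abs(weight).reshape(c_out // n, n, k).sum(axis=1)

    # Prune the lowest-scoring blocks (simple one-shot magnitude pruning).
    num_pruned = int(round(sparsity * scores.size))
    order = np.argsort(scores, axis=None)
    keep = np.ones(scores.size, dtype=bool)
    keep[order[:num_pruned]] = False
    keep = keep.reshape(scores.shape)

    # Expand the block-level decision back to one entry per weight.
    return np.repeat(keep, n, axis=0)


rng = np.random.default_rng(0)
w = rng.standard_normal((8, 6))          # 8 output channels, 6 input positions
mask = one_by_n_mask(w, n=4, sparsity=0.5)
sparse_w = w * mask                      # every block is fully kept or fully zero
kept_blocks = int(mask[::4].sum())       # one stored column index per kept block
```

Because every block is either kept whole or zeroed whole, a block-compressed format such as BSR needs only one column index per kept block rather than one per weight, which is where the storage saving comes from.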
Dec-26-2025, 11:51:36 GMT