TETRIS: TilE-matching the TRemendous Irregular Sparsity
Yu Ji, Ling Liang, Lei Deng, Youyang Zhang, Youhui Zhang, Yuan Xie
Neural Information Processing Systems
Pruning weights with small magnitudes compresses a neural network and can significantly reduce its computation and storage cost. Although pruning makes the model smaller, it is difficult to obtain a practical speedup on modern computing platforms such as CPUs and GPUs because the resulting sparsity is irregular. Structural pruning has therefore attracted considerable research interest as a way to make sparsity hardware-friendly. Increasing the sparsity granularity leads to better hardware utilization, but less sparsity can then be achieved while maintaining accuracy. In this work, we propose a novel method, TETRIS, to achieve both better hardware utilization and higher sparsity.
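The trade-off between sparsity granularity and retained weights can be illustrated with a minimal NumPy sketch. This is not the TETRIS method itself, only a generic comparison of element-wise magnitude pruning against coarse block-wise pruning. The function names `magnitude_prune` and `block_prune`, the 8x8 random matrix, and the 4x4 block size are all hypothetical choices made for illustration.

```python
import numpy as np


def magnitude_prune(w, sparsity):
    """Unstructured pruning: zero the `sparsity` fraction of entries with smallest |w|."""
    k = int(round(sparsity * w.size))
    out = w.copy()
    if k > 0:
        # Flat indices of the k smallest-magnitude entries.
        idx = np.argsort(np.abs(w), axis=None)[:k]
        out.flat[idx] = 0.0
    return out


def block_prune(w, sparsity, block):
    """Structured pruning: zero whole block x block tiles with the smallest summed |w|."""
    rows, cols = w.shape
    assert rows % block == 0 and cols % block == 0, "shape must be divisible by block"
    # Summed magnitude of each tile, shape (rows // block, cols // block).
    tiles = np.abs(w).reshape(rows // block, block, cols // block, block).sum(axis=(1, 3))
    k = int(round(sparsity * tiles.size))
    keep = np.ones(tiles.shape, dtype=bool)
    if k > 0:
        idx = np.argsort(tiles, axis=None)[:k]
        keep.flat[idx] = False
    # Expand the tile-level mask back to element resolution.
    mask = np.repeat(np.repeat(keep, block, axis=0), block, axis=1)
    return w * mask


# Illustrative data only: a random matrix stands in for a trained weight matrix.
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 8))

fine = magnitude_prune(w, sparsity=0.5)
coarse = block_prune(w, sparsity=0.5, block=4)

# At equal sparsity, element-wise pruning never retains less total magnitude
# than block pruning, which is forced to drop large weights inside weak tiles.
print("retained |w|, element-wise:", np.abs(fine).sum())
print("retained |w|, block-wise:  ", np.abs(coarse).sum())
```

At the same sparsity level the block-wise variant typically discards more weight magnitude. This is the sense in which coarser granularity compromises the achievable sparsity at a given accuracy, while its regular zero pattern is the part that maps well to hardware.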
Oct-7-2024, 17:01:43 GMT