A Nonlinear Hash-based Optimization Method for SpMV on GPUs
Yan, Chen, Diao, Boyu, Liu, Hangda, An, Zhulin, Xu, Yongjun
–arXiv.org Artificial Intelligence
Chen Yan a,b, Boyu Diao a,b, Hangda Liu a,b, Zhulin An a,b and Yongjun Xu a,b
a Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
b University of Chinese Academy of Sciences, Beijing, China
{yanchen23s, diaoboyu2012, liuhangda21s, anzhulin, xyj}@ict.ac.cn

Abstract -- Sparse matrix-vector multiplication (SpMV) is a fundamental operation with a wide range of applications in scientific computing and artificial intelligence. However, the large scale and sparsity of sparse matrices often make SpMV a performance bottleneck. In this paper, we highlight the effectiveness of hash-based techniques in optimizing sparse matrix reordering and introduce the Hash-based Partition (HBP) format, a lightweight SpMV approach. HBP retains the performance benefits of the 2D-partitioning method while leveraging the hash transformation's ability to group similar elements, thereby accelerating the pre-processing phase of sparse matrix reordering. Additionally, we achieve parallel load balancing across matrix blocks through a competitive method. Our experiments, conducted on both the Nvidia Jetson AGX Orin and the Nvidia RTX 4090, show that in the pre-processing step our method offers an average speedup of 3.53 times over the sorting approach and 3.67 times over the dynamic programming method employed in Regu2D. Furthermore, in SpMV itself, our method achieves a maximum speedup of 3.32 times on the Orin and 3.01 times on the RTX 4090 against the CSR format on sparse matrices from the University of Florida Sparse Matrix Collection.

Introduction

Sparse matrix-vector multiplication (SpMV) has a wide range of applications, such as solving sparse linear equations [13], iterative solution algorithms [15] [25], graph processing [9] [14] [24], and the weight computations of forward and backward propagation in neural networks [3] [12] [17] [19].
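The core idea behind hash-based grouping can be illustrated with a small sketch. This is a hypothetical illustration, not the paper's actual HBP implementation: each row's nonzero pattern is hashed to a bucket signature, and rows landing in the same bucket are placed adjacently, so reordering costs one hash per row rather than a full comparison sort.

```python
# Hypothetical sketch of hash-based row grouping for reordering.
# NOT the paper's HBP hash; an illustrative signature is used instead.
from collections import defaultdict

def group_rows_by_hash(rows, ncols, nbits=4):
    """rows: list of column-index lists (one per matrix row).
    Returns a row permutation placing rows with similar nonzero
    patterns next to each other, in O(nnz) time (no sorting of rows)."""
    buckets = defaultdict(list)
    for i, cols in enumerate(rows):
        # Signature: which of 2**nbits equal-width column segments
        # contain at least one nonzero (an illustrative hash choice).
        sig = 0
        for c in cols:
            sig |= 1 << (c * (1 << nbits) // ncols)
        buckets[sig].append(i)
    # Concatenate buckets; rows inside a bucket share a pattern signature.
    return [i for sig in sorted(buckets) for i in buckets[sig]]

# Rows 0 and 2 are left-heavy, row 1 is right-heavy, so 0 and 2 group:
print(group_rows_by_hash([[0, 1], [6, 7], [1, 2]], ncols=8, nbits=2))
# → [0, 2, 1]
```

The segment-bitmap signature here stands in for whatever nonlinear hash HBP actually uses; the point is only that similar rows collide into the same bucket without pairwise comparisons.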
However, SpMV is in practice the bottleneck for many of these algorithms. Sparse matrices used in SpMV have the following characteristics [4]: (1) Sparsity. On the one hand, sparse matrices contain a large number of zero elements.
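Because most elements are zero, compressed formats store only the nonzeros. As a point of reference for the CSR baseline used in the experiments above, a minimal sequential SpMV over CSR's three arrays (row pointers, column indices, values) can be sketched as:

```python
def spmv_csr(row_ptr, col_idx, vals, x):
    """y = A @ x for A in CSR form: row i's nonzeros occupy
    positions row_ptr[i] .. row_ptr[i+1]-1 of col_idx / vals."""
    n = len(row_ptr) - 1
    y = [0.0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += vals[k] * x[col_idx[k]]
    return y

# A = [[2, 0], [1, 3]], x = [1, 1]  →  y = [2, 4]
print(spmv_csr([0, 1, 3], [0, 0, 1], [2.0, 1.0, 3.0], [1.0, 1.0]))
# → [2.0, 4.0]
```

On a GPU this loop is parallelized across rows or nonzeros, which is exactly where irregular row lengths cause the load imbalance that partitioning and reordering schemes such as HBP target.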
Apr-15-2025