Multi-GPU implementation and performance optimization for CSR-based sparse matrix-vector multiplication
Document Type
Conference Proceeding
Publication Date
7-2-2017
Abstract
Sparse matrix-vector multiplication (SpMV) is a critical operation in scientific computing and engineering applications. CSR (Compressed Sparse Row) is the most popular sparse storage format, and CSR-based SpMV usually performs well on sparse matrices with a large number of non-zero elements. This paper presents our Multi-GPU SpMV implementation, which improves CSR-based SpMV performance. We use multiple GPUs to jointly complete SpMV computations and adopt a streamed approach that increases concurrency to further improve SpMV performance. We evaluate the performance of our Multi-GPU SpMV on a collection of fourteen sparse matrices and demonstrate the effectiveness of the proposed approach on a large-scale cluster. The average speedup achieved in our experiments is 6.68.
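For readers unfamiliar with the format, the following is a minimal CPU-only sketch of CSR-based SpMV with the rows split into contiguous blocks, one block per device. It is an illustration only, not the paper's implementation: the abstract does not describe how work is partitioned or how streams are used, so the equal-row-count split and the function names (`csr_spmv_rows`, `partition_rows`, `multi_device_spmv`) are assumptions, and the "devices" are simulated by a plain loop.

```python
def csr_spmv_rows(row_ptr, col_idx, values, x, row_start, row_end):
    """Compute y[row_start:row_end] of y = A @ x for a CSR matrix A."""
    y = []
    for i in range(row_start, row_end):
        acc = 0.0
        # Non-zeros of row i occupy values[row_ptr[i] : row_ptr[i + 1]].
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def partition_rows(n_rows, n_parts):
    """Split rows into n_parts contiguous blocks of near-equal row count.

    Assumption for illustration: a real implementation might balance by
    non-zero count instead of row count.
    """
    base, extra = divmod(n_rows, n_parts)
    bounds = []
    start = 0
    for p in range(n_parts):
        end = start + base + (1 if p < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds


def multi_device_spmv(row_ptr, col_idx, values, x, n_devices):
    """Simulate multi-device SpMV: each block would run on its own GPU."""
    n_rows = len(row_ptr) - 1
    y = []
    for start, end in partition_rows(n_rows, n_devices):
        # On real hardware each call would be an independent kernel launch.
        y.extend(csr_spmv_rows(row_ptr, col_idx, values, x, start, end))
    return y


# Example matrix:
# [[1, 0, 2],
#  [0, 3, 0],
#  [4, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]

y = multi_device_spmv(row_ptr, col_idx, values, x, n_devices=2)
print(y)  # [7.0, 6.0, 19.0]
```

Because each row of the result depends only on that row's non-zeros and the shared vector x, the row blocks are independent of one another, which is what makes it possible to distribute them across several GPUs.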
Publication Title
2017 3rd IEEE International Conference on Computer and Communications, ICCC 2017
First Page Number
2419
Last Page Number
2423
DOI
10.1109/CompComm.2017.8322969
Recommended Citation
Guo, Ping and Zhang, Changjiang, "Multi-GPU implementation and performance optimization for CSR-based sparse matrix-vector multiplication" (2017). Kean Publications. 1600.
https://digitalcommons.kean.edu/keanpublications/1600