Multi-GPU implementation and performance optimization for CSR-based sparse matrix-vector multiplication

Document Type

Conference Proceeding

Publication Date

7-2-2017

Abstract

Sparse matrix-vector multiplication (SpMV) is a critical operation in scientific computing and engineering applications. CSR (Compressed Sparse Row) is the most popular sparse storage format, and CSR-based SpMV usually performs well on sparse matrices with a large number of non-zero elements. This paper presents our Multi-GPU SpMV implementation to improve CSR-based SpMV performance. We use multiple GPUs to jointly complete SpMV computations and adopt a streamed approach that increases concurrency to further improve SpMV performance. We evaluate the performance of our Multi-GPU SpMV on a collection of fourteen sparse matrices and demonstrate the effectiveness of the proposed approach on a large-scale cluster. The average speedup achieved in our experiments is 6.68.
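
For readers unfamiliar with the format, the following is a minimal sketch of CSR-based SpMV with the rows divided among several workers. It is illustrative only and is not the paper's implementation: the abstract does not say how work is split across GPUs, so the contiguous row blocks balanced by non-zero count are an assumption, the "devices" are simulated by a sequential loop on the CPU, and all function names (csr_spmv, partition_rows_by_nnz, multi_device_spmv) are hypothetical.

```python
from bisect import bisect_left


def csr_spmv(row_ptr, col_idx, values, x, row_start=0, row_end=None):
    """Return y[row_start:row_end] for y = A @ x, with A stored in CSR.

    row_ptr[i]..row_ptr[i + 1] delimits the non-zeros of row i inside
    col_idx (column indices) and values (matrix entries).
    """
    if row_end is None:
        row_end = len(row_ptr) - 1
    y = []
    for i in range(row_start, row_end):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def partition_rows_by_nnz(row_ptr, num_parts):
    """Split rows into contiguous blocks with roughly equal non-zero counts.

    Assumption: balancing by non-zeros is one common heuristic; the paper
    may partition differently.
    """
    n_rows = len(row_ptr) - 1
    nnz = row_ptr[-1]
    bounds = [0]
    for p in range(1, num_parts):
        target = nnz * p / num_parts
        cut = bisect_left(row_ptr, target)
        # Keep the boundaries monotonic and inside the matrix.
        cut = min(max(cut, bounds[-1]), n_rows)
        bounds.append(cut)
    bounds.append(n_rows)
    return list(zip(bounds[:-1], bounds[1:]))


def multi_device_spmv(row_ptr, col_idx, values, x, num_devices):
    """Simulate multi-device SpMV: each block of rows is computed separately
    and the partial results are concatenated in row order."""
    y = []
    for start, end in partition_rows_by_nnz(row_ptr, num_devices):
        # On real hardware each block would run on its own GPU or stream.
        y.extend(csr_spmv(row_ptr, col_idx, values, x, start, end))
    return y


# Example: the 4x4 matrix
#   [1 0 2 0]
#   [0 3 0 0]
#   [4 0 5 6]
#   [0 0 0 7]
row_ptr = [0, 2, 3, 6, 7]
col_idx = [0, 2, 1, 0, 2, 3, 3]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
x = [1.0, 2.0, 3.0, 4.0]

y = multi_device_spmv(row_ptr, col_idx, values, x, num_devices=2)
```

Because each row of the result depends only on that row's non-zeros and on the shared input vector, the row blocks are independent, which is the property that makes it possible to distribute the work and overlap it across devices.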

Publication Title

2017 3rd IEEE International Conference on Computer and Communications, ICCC 2017

First Page Number

2419

Last Page Number

2423

DOI

10.1109/CompComm.2017.8322969
