Return to search

Optimizing Sparse Matrix-Matrix Multiplication on a Heterogeneous CPU-GPU Platform

Sparse Matrix-Matrix multiplication (SpMM) is a fundamental operation over irregular data, which is widely used in graph algorithms, such as finding minimum spanning trees and shortest paths. In this work, we present a hybrid CPU and GPU-based parallel SpMM algorithm to improve the performance of SpMM. First, we improve data locality by element-wise multiplication. Second, we utilize the ordered property of row indices for partial sorting instead of full sorting of all triples according to row and column indices. Finally, through a hybrid CPU-GPU approach using two level pipelining technique, our algorithm is able to better exploit a heterogeneous system. Compared with the state-of-the-art SpMM methods in cuSPARSE and CUSP libraries, our approach achieves an average of 1.6x and 2.9x speedup separately on the nine representative matrices from University of Florida sparse matrix collection.

Identiferoai:union.ndltd.org:GEORGIA/oai:scholarworks.gsu.edu:cs_theses-1085
Date16 December 2015
CreatorsWu, Xiaolong
PublisherScholarWorks @ Georgia State University
Source SetsGeorgia State University
Detected LanguageEnglish
Typetext
Formatapplication/pdf
SourceComputer Science Theses

Page generated in 0.0018 seconds