Return to search

Acceleration of Block-Aware Matrix Factorization on Heterogeneous Platforms

Block-structured matrices arise in several contexts in circuit
simulation problems. These matrices typically inherit the pattern of
sparsity from the circuit connectivity. However, they are also
characterized by dense spots or blocks. Direct factorization of those
matrices has emerged as an attractive approach if the host memory is sufficiently large to store the block-structured matrix. The approach proposed in this thesis aims to accelerate the direct factorization of general block-structured matrices by leveraging the power of multiple OpenCL accelerators such as Graphical Processing Units (GPUs).

The proposed approach utilizes the notion of a Directed Acyclic Graph representing the matrix in order to schedule its factorization on multiple accelerators. This thesis also describes memory management techniques that enable handling large matrices while minimizing the amount of memory transfer over the PCIe bus between the host CPU and the attached devices. The results demonstrate that by using two GPUs the proposed approach can achieve a nearly optimal speedup when compared to a
single GPU platform.

Identiferoai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/35128
Date January 2016
CreatorsSomers, Gregory W.
ContributorsGad, Emad, Bolic, Miodrag
PublisherUniversité d'Ottawa / University of Ottawa
Source SetsUniversité d’Ottawa
LanguageEnglish
Detected LanguageEnglish
TypeThesis

Page generated in 0.002 seconds