Exploiting parallelism of irregular problems and performance evaluation on heterogeneous multi-core architectures

In this thesis, we design, develop and implement parallel algorithms for irregular problems on heterogeneous multi-core architectures. Irregular problems exhibit random and unpredictable memory access patterns, poor spatial locality and input-dependent control flow. Heterogeneous multi-core processors vary in clock frequency, power dissipation, programming model (MIMD vs. SIMD), memory design and computing units (scalar versus vector units). This heterogeneity makes it challenging to design efficient parallel algorithms for irregular problems on such processors. Techniques for mapping tasks or data on traditional parallel computers cannot be used as is on heterogeneous multi-core processors because of the varying hardware. To understand how efficiently future heterogeneous multi-core architectures run applications, we study several computation- and bandwidth-oriented irregular problems on one heterogeneous multi-core architecture, the IBM Cell Broadband Engine (Cell BE). The Cell BE consists of one general-purpose processor and eight specialized processors, and it addresses vector/data-level parallelism and instruction-level parallelism simultaneously. Through these studies on the Cell BE, we provide discussion and insight on the performance of such applications on heterogeneous multi-core architectures.

Verifying these experimental results requires some performance modeling. Because heterogeneous multi-core architectures are so diverse, theoretical performance models developed for homogeneous multi-core architectures do not provide accurate results. Therefore, in this thesis we propose an analytical performance prediction model that considers multiple architectural features of heterogeneous multi-cores (such as DMA transfers, the number of instructions and operations, the processor frequency and the DMA bandwidth). We show that the execution time given by our prediction model is comparable to the experimentally measured execution time for a complex medical imaging application.
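
To make concrete how such architectural features might combine into a time estimate, the following is a minimal sketch only. It is not the model proposed in the thesis, whose equations are not given in this abstract. The names `KernelProfile` and `predict_time`, the cycles-per-instruction term, the per-transfer latency and the `overlap` flag are all hypothetical assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class KernelProfile:
    """Hypothetical workload description; every field is an assumption."""
    instructions: int              # instructions executed by the kernel
    cycles_per_instruction: float  # assumed average CPI on the target core
    dma_bytes: int                 # total bytes moved by DMA
    dma_transfers: int             # number of DMA transfers issued


def predict_time(profile, frequency_hz, dma_bandwidth_bytes_per_s,
                 dma_latency_s=0.0, overlap=False):
    """Estimate execution time in seconds from a compute term and a DMA term."""
    # Compute term: total cycles divided by the clock frequency.
    compute = (profile.instructions * profile.cycles_per_instruction
               / frequency_hz)
    # DMA term: bytes over bandwidth plus a fixed cost per transfer.
    dma = (profile.dma_bytes / dma_bandwidth_bytes_per_s
           + profile.dma_transfers * dma_latency_s)
    # Assumption: with ideal double buffering the slower term dominates;
    # without overlap the two terms simply add.
    return max(compute, dma) if overlap else compute + dma


# Illustrative numbers only, not measurements from the thesis.
profile = KernelProfile(instructions=3_200_000, cycles_per_instruction=1.0,
                        dma_bytes=25_600_000, dma_transfers=0)
serial = predict_time(profile, 3.2e9, 25.6e9)
overlapped = predict_time(profile, 3.2e9, 25.6e9, overlap=True)
```

In this sketch the `overlap` flag separates two limiting cases: computation and data transfer fully serialized, or fully hidden behind one another. A real model for a given architecture would likely fall between the two and would need calibrated parameters.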

Identifier: oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:MWU.1993/9236
Date: 04 October 2012
Creators: Xu, Meilian
Contributors: Thulasiraman, Parimala (Computer Science); Li, Ben (Computer Science); Annakkage, Udaya (Electrical and Computer Engineering); Yang, Laurence (Computer Science, St. Francis Xavier University)
Source Sets: Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada
Detected Language: English
