This thesis is composed of two parts, both of which relate to parallel and heterogeneous processing.
The first describes DistCL, a distributed OpenCL framework that allows a cluster of GPUs to be programmed like a single device.
It uses programmer-supplied meta-functions that associate work-items with memory.
DistCL achieves speedups of up to 29x using 32 peers.
By comparing DistCL to SnuCL, we determine that the compute-to-transfer ratio of a benchmark is the best predictor of its performance scaling when distributed.
The second is a statistical power model for the AMD Fusion heterogeneous processor.
We present a systematic methodology to create a representative set of compute micro-benchmarks using data collected from real hardware.
The power model is created with data from both micro-benchmarks and application benchmarks.
The model showed an average predictive error of 6.9% on heterogeneous workloads.
The Multi2Sim heterogeneous simulator was modified to support configurable power modelling.
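As an illustration of how an average predictive error such as the one quoted above can be computed, the following is a minimal sketch. It assumes the error is the mean absolute percentage difference between measured and predicted power, and it assumes a simple linear model form; the abstract specifies neither. The function names, coefficients, and sample values are hypothetical and are not taken from the thesis.

```python
# Minimal sketch, NOT the thesis's actual model or data.
# Assumption: power is modelled as idle power plus a weighted sum of
# activity counters (a linear form the abstract does not confirm).

def predict_power(idle_watts, coefficients, activity):
    """Predict power (W) from per-component activity rates.

    coefficients and activity are equal-length sequences; each
    coefficient is a hypothetical watts-per-unit-activity weight.
    """
    if len(coefficients) != len(activity):
        raise ValueError("coefficients and activity must have equal length")
    return idle_watts + sum(c * a for c, a in zip(coefficients, activity))


def average_predictive_error(measured, predicted):
    """Mean absolute percentage error between measured and predicted power.

    Assumption: this is the error definition; the abstract only reports
    the resulting figure.
    """
    if len(measured) != len(predicted) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs(p - m) / m for m, p in zip(measured, predicted))
    return 100.0 * total / len(measured)


if __name__ == "__main__":
    # Made-up illustrative values, not measurements from real hardware.
    idle = 10.0
    coeffs = [2.0, 5.0]                      # hypothetical CPU and GPU weights
    workloads = [[1.0, 0.0], [0.5, 1.0], [1.0, 2.0]]
    measured = [12.5, 15.0, 23.0]            # hypothetical measured watts
    predicted = [predict_power(idle, coeffs, w) for w in workloads]
    print(f"average predictive error: "
          f"{average_predictive_error(measured, predicted):.1f}%")
```

A relative error is used here so that workloads with very different power levels contribute equally to the average; an absolute error in watts would be an equally plausible choice.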
Identifier | oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:OTU.1807/42818 |
Date | 22 November 2013 |
Creators | Diop, Tahir |
Contributors | Anderson, Jason; Enright Jerger, Natalie |
Source Sets | Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada |
Language | en_ca |
Detected Language | English |
Type | Thesis |