Accelerator architectures are used to speed up the simulation of nonlinear hyperbolic PDEs. Three architectures are investigated: a multicore CPU using threading, IBM’s Cell processor, and Nvidia’s Tesla GPUs. Speed-ups of 40 to 75× relative to a single CPU core, in single precision, are obtained using the Cell processor and the GPU. The three implementations are extended to parallel computing clusters using the Message Passing Interface (MPI). The resulting hybrid-parallel code is investigated for performance and scalability on both a GPU cluster and a Cell cluster.
Identifier | oai:union.ndltd.org:WATERLOO/oai:uwspace.uwaterloo.ca:10012/4518 |
Date | 15 July 2009 |
Creators | Rostrup, Scott |
Source Sets | University of Waterloo Electronic Theses Repository |
Language | English |
Detected Language | English |
Type | Thesis or Dissertation |