Return to search

A compiler for parallel execution of numerical Python programs on graphics processing units

Modern Graphics Processing Units (GPUs) are providing breakthrough performance for numerical computing at the cost of increased programming complexity. Current programming models for GPUs require that the programmer manually manage the data transfer between CPU and GPU. This thesis proposes a simpler programming model and introduces a new compilation framework to enable Python applications containing numerical computations to be executed on GPUs and multi-core CPUs.

The new programming model minimally extends Python to include type and parallel-loop annotations. Our compiler framework then automatically identifies the data to be transferred between the main memory and the GPU for a particular class of affine array accesses. The compiler also automatically performs loop transformations to improve performance on GPUs.

For kernels with regular loop structure and simple memory access patterns, the GPU code generated by the compiler achieves significant performance improvement over multi-core CPU codes.

Identiferoai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:AEU.10048/762
Date11 1900
CreatorsGarg, Rahul
ContributorsAmaral, Jose Nelson (Computing Science), Lu, Paul (Computing Science), Cockburn, Bruce (Electrical and Computer Engineering)
Source SetsLibrary and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada
Detected LanguageEnglish
TypeThesis
Format1862977 bytes, application/pdf

Page generated in 0.0031 seconds