
Towards Enhancing Performance, Programmability, and Portability in Heterogeneous Computing

The proliferation of a diverse set of heterogeneous computing platforms, in conjunction with the plethora of programming languages and of optimization techniques for each language on each underlying architecture, hinders widespread adoption of such platforms. This is especially true for novice programmers and the non-tech-savvy masses, who are largely precluded from enjoying the advantages of high-performance computing. Moreover, different groups within the heterogeneous computing community (e.g., hardware architects, tool developers, and programmers) face new challenges with respect to the performance, programmability, and portability (the three P's) of heterogeneous computing.

In this work we discuss such challenges and identify benchmarking techniques based on computation and communication patterns as an appropriate means for the systematic evaluation of heterogeneous computing with respect to the three P's. Our proposed approach is based on OpenCL implementations of the Berkeley dwarfs. We use our benchmark suite (OpenDwarfs) to characterize the performance of state-of-the-art parallel architectures, and as the main component of a methodology (Telescoping Architectures) for identifying trends in future heterogeneous architectures. Furthermore, we employ OpenDwarfs in a multi-faceted study of the gaps between the three P's in the context of the modern heterogeneous computing landscape. Our case study spans a variety of compilers, languages, optimizations, and target architectures, including the CPU, GPU, MIC, and FPGA. Based on our insights, and extending aspects of prior research (e.g., in compilers, programming languages, and auto-tuning), we propose grid-based data structures as the basis of programming frameworks and present a prototype unified framework (GLAF) that encompasses a novel visual programming environment with code generation, auto-parallelization, and auto-tuning capabilities. Our results, which span several scientific domains, indicate that our holistic approach constitutes a viable alternative towards enhancing the three P's and further democratizing heterogeneous, parallel computing for non-programming-savvy audiences, especially domain scientists.

Ph. D.

In the past decade computing has moved from single-core machines, i.e., machines with a CPU that executes code serially, to multi-core ones, i.e., machines with CPUs that can execute code in parallel. Another paradigm shift that has manifested in recent years entails computing that utilizes heterogeneous processing, as opposed to homogeneous processing. In the latter case, a single type of processor (the CPU) is responsible for executing a given program, whereas in the former, different types of processors (such as CPUs, graphics processors, or other accelerators) collaborate to tackle computationally difficult problems in a fast, parallel manner.
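To make the host-accelerator collaboration described above concrete, the sketch below shows a minimal vector addition written in C with OpenCL, the language used by the OpenDwarfs suite. It is an illustrative example only, not OpenDwarfs code; the kernel name vadd, the problem size n, and the file name vadd.c are all made up for this sketch. The host CPU selects a device (GPU, MIC, FPGA, or the CPU itself), copies data to it, launches a data-parallel kernel, and reads the result back.

/*
 * Illustrative sketch (not OpenDwarfs code): the host CPU offloads a
 * vector addition to whatever OpenCL device is available.
 * Error handling is abbreviated for brevity.
 * Build (assuming an OpenCL SDK is installed): cc vadd.c -lOpenCL
 */
#include <stdio.h>
#include <stdlib.h>
#include <CL/cl.h>

/* Data-parallel kernel: each work-item adds one pair of elements. */
static const char *kernel_src =
    "__kernel void vadd(__global const float *a,                   \n"
    "                   __global const float *b,                   \n"
    "                   __global float *c, const unsigned int n) { \n"
    "    size_t i = get_global_id(0);                              \n"
    "    if (i < n) c[i] = a[i] + b[i];                            \n"
    "}                                                             \n";

int main(void)
{
    const unsigned int n = 1 << 20;
    size_t bytes = n * sizeof(float), global = n;
    float *a = malloc(bytes), *b = malloc(bytes), *c = malloc(bytes);
    for (unsigned int i = 0; i < n; i++) { a[i] = (float)i; b[i] = 2.0f * i; }

    cl_int err;
    cl_platform_id platform;
    cl_device_id device;
    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_DEFAULT, 1, &device, NULL);

    cl_context ctx = clCreateContext(NULL, 1, &device, NULL, NULL, &err);
    cl_command_queue q = clCreateCommandQueue(ctx, device, 0, &err);

    /* Compile the kernel at run time -- this is what lets the same
     * source run on CPUs, GPUs, MICs, and FPGAs. */
    cl_program prog = clCreateProgramWithSource(ctx, 1, &kernel_src, NULL, &err);
    clBuildProgram(prog, 1, &device, NULL, NULL, NULL);
    cl_kernel k = clCreateKernel(prog, "vadd", &err);

    /* Copy inputs to device memory and set kernel arguments. */
    cl_mem da = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, bytes, a, &err);
    cl_mem db = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, bytes, b, &err);
    cl_mem dc = clCreateBuffer(ctx, CL_MEM_WRITE_ONLY, bytes, NULL, &err);
    clSetKernelArg(k, 0, sizeof(cl_mem), &da);
    clSetKernelArg(k, 1, sizeof(cl_mem), &db);
    clSetKernelArg(k, 2, sizeof(cl_mem), &dc);
    clSetKernelArg(k, 3, sizeof(unsigned int), &n);

    /* Launch one work-item per element, then read the result back. */
    clEnqueueNDRangeKernel(q, k, 1, NULL, &global, NULL, 0, NULL, NULL);
    clEnqueueReadBuffer(q, dc, CL_TRUE, 0, bytes, c, 0, NULL, NULL);

    printf("c[42] = %f\n", c[42]);

    clReleaseMemObject(da); clReleaseMemObject(db); clReleaseMemObject(dc);
    clReleaseKernel(k); clReleaseProgram(prog);
    clReleaseCommandQueue(q); clReleaseContext(ctx);
    free(a); free(b); free(c);
    return 0;
}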

The shift to multi-core, parallel, heterogeneous computing described above is accompanied by a corresponding shift in programming languages for such platforms, as well as in the techniques used to optimize programs for high performance (i.e., execution speed). The unique complexities of parallel and heterogeneous computing hinder widespread adoption of such platforms. This is especially true for novice programmers and the non-tech-savvy masses, who are largely precluded from the advantages of high-performance computing. The challenges include obtaining fast execution speeds (i.e., performance), ease of programming (i.e., programmability), and the ability to execute programs across different heterogeneous platforms (i.e., portability). Performance, programmability, and portability constitute the 3 P's of heterogeneous computing.

In this work we discuss the above challenges in detail and provide insights and solutions for different interest groups within the computing community, such as computer architects, tool developers, and programmers. We propose an approach for evaluating existing heterogeneous computing platforms based on the concept of dwarf-based benchmarks (i.e., applications characterized by certain computation and communication patterns). Furthermore, we propose a methodology for utilizing the dwarf concept to evaluate potential future heterogeneous platforms. In our research we attempt to quantify the trade-offs between performance, programmability, and portability across a wide set of modern heterogeneous platforms. Based on the above, we seek to bridge the 3 P's by introducing a programming framework that democratizes parallel algorithm development on heterogeneous architectures for novice programmers and domain scientists. Specifically, our framework produces parallel, optimized code implementations in multiple languages with the potential to execute across different heterogeneous platforms.
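As an illustration of the kind of output such a framework targets, consider a computation expressed over a regular two-dimensional grid. Because every grid point can be updated independently, a code generator may legitimately annotate the loop nest for parallel execution. The sketch below is hand-written for illustration, not actual GLAF-generated code; the grid dimensions NX and NY and the file name stencil.c are made up, and OpenMP stands in for whichever parallel backend the framework emits.

/*
 * Illustrative sketch only: a 5-point stencil over a regular 2-D grid,
 * the kind of grid-based computation a framework such as GLAF could
 * auto-parallelize. The OpenMP pragma plays the role of the generated
 * parallelization. Build: cc -fopenmp stencil.c
 */
#include <stdio.h>
#include <stdlib.h>

#define NX 1024
#define NY 1024

int main(void)
{
    double *in  = malloc(NX * NY * sizeof(double));
    double *out = malloc(NX * NY * sizeof(double));
    for (int i = 0; i < NX * NY; i++) in[i] = (double)(i % 7);

    /* Each interior point depends only on the previous grid (in), so all
     * iterations are independent and the loop nest is safely parallel. */
    #pragma omp parallel for collapse(2)
    for (int y = 1; y < NY - 1; y++)
        for (int x = 1; x < NX - 1; x++)
            out[y * NX + x] = 0.2 * (in[y * NX + x]
                                   + in[y * NX + x - 1] + in[y * NX + x + 1]
                                   + in[(y - 1) * NX + x] + in[(y + 1) * NX + x]);

    printf("out[NX+1] = %f\n", out[NX + 1]);
    free(in);
    free(out);
    return 0;
}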

Identifier: oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/77582
Date: 03 May 2017
Creators: Krommydas, Konstantinos
Contributors: Computer Science, Feng, Wu-chun, Sasanka, Ruchira, Tilevich, Eli, Butt, Ali R., Cao, Yong
Publisher: Virginia Tech
Source Sets: Virginia Tech Theses and Dissertation
Detected Language: English
Type: Dissertation
Format: ETD, application/pdf
Rights: In Copyright, http://rightsstatements.org/vocab/InC/1.0/
