
A Study on the Generation of Local Memory Access Sequences and Communication Sets for Data-Parallel Programs

Shiu, Liang-Cheng 13 February 2003
Distributed-memory multiprocessors offer the very high levels of performance required by scientific applications. A traditional programming language cannot be expected to yield good performance when used to program such machines. Data-parallel languages provide programmers with a global memory and relieve them of the burden of inserting time-consuming, error-prone inter-processor communication; the compilers of these languages perform this task instead. Data-parallel languages also let programmers specify alignment and distribution directives, which describe the type of data parallelism and the mapping of data onto the underlying parallel architecture. When compiling an array statement, parallelizing compilers distribute data and generate code according to the owner-computes rule. The array elements that a processor owns are only a fraction of all the array elements, and not all of the elements it owns are active, so determining the local memory access sequence is important. However, generating local memory access sequences becomes rather complicated when the array references involve complex subscripts. This study considers two types of complex subscript: coupled subscripts and multiple induction variables (MIV). A processor may also refer to rhs (right-hand side) array elements owned by other processors, so data movement is inevitable. The overhead of accessing non-local data through inter-processor communication may be roughly 10 to 100 times the cost of accessing local data, so generating communication sets efficiently is important. This thesis introduces the concept of block compression/decompression, using smaller iteration tables, course distance and local block distance to solve the problems of local memory access sequences, coupled subscripts, MIV subscripts and communication set generation. Related work on these problems is reviewed, and experimental results are presented to demonstrate the benefit of the proposed methods.
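
To make the two central notions concrete, here is a minimal brute-force sketch in Python. It is not the method proposed in the thesis (block compression/decompression and iteration tables are not reproduced here). It assumes a one-dimensional array with a block-cyclic cyclic(k) distribution over p processors, 0-based indexing, and a statement of the form A(i) = B(i + shift) over the section lower:upper:stride. The function names and the example parameters are hypothetical and chosen only for illustration.

```python
def owner(i, p, k):
    """Processor owning global index i under a cyclic(k) distribution on p processors."""
    return (i // k) % p


def local_address(i, p, k):
    """Offset of global index i inside its owner's local memory."""
    return (i // (p * k)) * k + i % k


def local_access_sequence(m, p, k, lower, upper, stride):
    """Local addresses of A touched by processor m for A(lower:upper:stride).

    Brute force: scan every active element and keep those that m owns
    (owner-computes rule). Assumption: stride > 0 and upper is inclusive.
    """
    return [local_address(i, p, k)
            for i in range(lower, upper + 1, stride)
            if owner(i, p, k) == m]


def receive_set(m, q, p, k, lower, upper, stride, shift):
    """Global indices of B owned by q that m must receive for A(i) = B(i + shift)."""
    return [i + shift
            for i in range(lower, upper + 1, stride)
            if owner(i, p, k) == m and owner(i + shift, p, k) == q]


if __name__ == "__main__":
    # Hypothetical example: 4 processors, block size 4, section 0:60:5.
    print(local_access_sequence(0, 4, 4, 0, 60, 5))  # [0, 11, 14]
    print(receive_set(0, 1, 4, 4, 0, 60, 5, 1))      # [36]
```

The scan visits every active element on every processor, so its cost grows with the section length rather than with the number of locally owned elements. Avoiding that kind of cost is the motivation for table-driven approaches such as the ones the abstract describes.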
