1 |
A Run-Time Loop Parallelization Technique on Shared-Memory Multiprocessor Systems
Wu, Chi-Fan 06 July 2000
High-performance computing power is essential for today's advanced scientific calculations. A multiprocessor system obtains its high performance from the fact that some computations can proceed in parallel. A parallelizing compiler takes a sequential program as input and automatically translates it into parallel form for the target multiprocessor system. However, when loops contain arrays with irregular, nonlinear, or dynamic access patterns, no current parallelizing compiler can determine at compile time whether data dependences exist. A run-time parallelization algorithm is therefore needed to determine the dependences and extract the potential parallelism of such loops. In this thesis, we propose an efficient run-time parallelization technique that computes a proper parallel execution schedule for those loops. The new method first detects the immediate predecessor iterations of each loop iteration and constructs an immediate predecessor table, then efficiently schedules all loop iterations into wavefronts for parallel execution. Both theoretical analysis and experimental results show that the new technique achieves high speedup with low processing overhead. Furthermore, because it is highly scalable, the technique is well suited to implementation on multiprocessor systems.
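The idea of an immediate predecessor table followed by wavefront scheduling can be illustrated with a minimal Python sketch. This is not the thesis's algorithm, only a simple sequential illustration under stated assumptions: each iteration i reads `A[reads[i]]` and then writes `A[writes[i]]`, and the function names `immediate_predecessors` and `wavefronts` are hypothetical.

```python
def immediate_predecessors(reads, writes):
    """Build a predecessor table for a loop (assumed access pattern).

    Assumption: iteration i reads A[reads[i]] and then writes A[writes[i]].
    For each iteration, record the latest earlier iterations it must wait for.
    """
    last_write = {}   # element -> last iteration that wrote it
    last_reads = {}   # element -> iterations that read it since its last write
    preds = []
    for i, (r, w) in enumerate(zip(reads, writes)):
        p = set()
        if r in last_write:                # flow dependence (read after write)
            p.add(last_write[r])
        if w in last_write:                # output dependence (write after write)
            p.add(last_write[w])
        p.update(last_reads.get(w, ()))    # anti dependence (write after read)
        preds.append(sorted(p))
        last_reads.setdefault(r, []).append(i)
        last_write[w] = i
        last_reads[w] = []                 # earlier readers are now ordered before i
    return preds


def wavefronts(preds):
    """Group iterations into wavefronts: one level past the deepest predecessor."""
    level = []
    for p in preds:
        level.append(1 + max((level[j] for j in p), default=-1))
    fronts = [[] for _ in range(max(level, default=-1) + 1)]
    for i, lv in enumerate(level):
        fronts[lv].append(i)
    return fronts


# Hypothetical index arrays, known only at run time.
reads = [0, 1, 2, 1, 3]
writes = [1, 2, 3, 4, 1]
schedule = wavefronts(immediate_predecessors(reads, writes))
print(schedule)  # [[0], [1, 3], [2], [4]]
```

Iterations in the same wavefront have no dependences among them and could run in parallel; the wavefronts themselves execute one after another.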
|
2 |
A Parallelizing Compiler for Fortran
Janaki, S 08 1900
With the advent of Distributed Memory Machines (DMMs), much work has been undertaken to ease the task of programming these systems. Data parallel languages like Fortran D, Vienna Fortran, High Performance Fortran and C+ allow the user to specify data distribution across processors with directives, and the compilers for these languages use the directives to compile the program into SPMD code. A number of old programs are still in use, and rewriting them in the new data parallel languages is costly.
Most of the work on these parallelizing compilers concentrates on efficient data communication between the processors. As technology advances, data communication time is decreasing, which allows bigger programs to execute in the same time span. The finite resources of a DMM, however, limit the size of the problem that can be run. Improving the memory usage for a problem will hence allow us to run bigger problems.
Further, as communication speed increases, the overhead caused by housekeeping computations, such as global-to-local index transformation and owner processor computation, will degrade the performance of the resultant code. Hence a uniform and efficient method for these computations also becomes a necessity.
We have implemented the parallelizing parts of a compiler using the SUIF compiler system; it accepts programs written in Fortran 77 with directives to the compiler as comments. The output of the compiler is an SPMD C program with embedded PVM calls for message communication between the processors.
We have also proposed algorithms to improve data communication and minimize memory usage in the output code. A uniform method for performing owner processor computations and global-to-local transformations has also been implemented.
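As a rough illustration of what owner processor computation and global-to-local index transformation involve, here is a minimal Python sketch. It does not reproduce the thesis's uniform method; it assumes 0-based indices and the two standard HPF-style distributions (block and cyclic), and all function names are hypothetical.

```python
def block_size(n, p):
    """Elements per processor when n elements are split into p contiguous blocks."""
    return -(-n // p)  # ceil(n / p) using integer arithmetic


def block_owner(g, n, p):
    """Processor that owns global index g under a block distribution."""
    return g // block_size(n, p)


def block_local(g, n, p):
    """Local index of global index g on its owner under a block distribution."""
    return g % block_size(n, p)


def block_global(owner, local, n, p):
    """Inverse mapping: (owner, local index) back to the global index."""
    return owner * block_size(n, p) + local


def cyclic_owner(g, p):
    """Processor that owns global index g under a cyclic distribution."""
    return g % p


def cyclic_local(g, p):
    """Local index of global index g on its owner under a cyclic distribution."""
    return g // p


def cyclic_global(owner, local, p):
    """Inverse mapping for the cyclic distribution."""
    return local * p + owner


# Hypothetical example: 10 elements over 4 processors.
n, p, g = 10, 4, 7
print(block_owner(g, n, p), block_local(g, n, p))   # 2 1
print(cyclic_owner(g, p), cyclic_local(g, p))       # 3 1
```

In generated SPMD code these computations run on every array reference, which is why their cost matters once communication becomes fast.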
|
3 |
SAGE: An Automatic Analyzing and Parallelizing System to Improve Performance and Reduce Energy on a New High-Performance SoC Architecture: Processor-in-Memory
Chu, Slo-Li 04 October 2002
Continuous improvements in semiconductor fabrication density are enabling new classes of System-on-a-Chip (SoC) architectures that combine extensive processing logic with high-density memory. Such architectures are generally called Processor-in-Memory or Intelligent Memory and can support high-performance computing by reducing the performance gap between the processor and the memory. This architecture combines various processors in a single system. These processors differ in their computational and memory-access capabilities, in terms of both performance and energy consumption. The two main problems addressed here are how to improve the performance and how to reduce the energy consumption of applications running on Processor-in-Memory architectures. A novel strategy must therefore be developed to identify the capabilities of the different processors and dispatch the most appropriate jobs to each, so that they are fully exploited. Accordingly, this study proposes a novel automatic source-to-source parallelizing system, called SAGE, to exploit the advantages of Processor-in-Memory architectures. Unlike conventional iteration-based parallelizing systems, SAGE adopts statement-based analytical approaches. The strategy of the SAGE system, which decomposes the original program into blocks and produces a feasible execution schedule for the host and memory processors, is also investigated. Several techniques, including statement splitting, weight evaluation, performance scheduling and energy reduction scheduling, are designed and integrated into the SAGE system to automatically transform Fortran source programs, either to improve their performance or to reduce the energy they consume when executed on a Processor-in-Memory architecture. This thesis describes these techniques in detail and discusses experimental results for real benchmarks transformed by the SAGE system and targeted at the Processor-in-Memory architecture.
|
4 |
Automatic Data Partitioning By Hierarchical Genetic Search
Shenoy, U Nagaraj 09 1900
CDAC / The introduction of languages like High Performance Fortran (HPF), which allow the programmer to indicate how the arrays used in the program are to be distributed across the local memories of a multi-computer, has not completely unburdened the parallel programmer from the intricacies of these architectures. In order to tap the full potential of these architectures, the compiler has to perform the crucial task of data partitioning automatically. This would not only unburden the programmer but also make programs more efficient, since the compiler can be made intelligent enough to take care of the architectural nuances.
The topic of this thesis, automatic data partitioning, deals with finding the best data partition for the various arrays used in the entire program, such that the cost of executing the entire program is minimized. The compiler could resort to runtime redistribution of the arrays at various points in the program if that is found profitable. Several aspects of this problem have been proven to be NP-complete. Other researchers have suggested heuristic solutions to this problem. In this thesis we propose a genetic algorithm, the Hierarchical Genetic Search algorithm, to solve it.
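To give a feel for how a genetic search over data partitions might look, here is a minimal Python sketch. It is a plain genetic algorithm, not the Hierarchical Genetic Search algorithm of the thesis, and the cost model is a toy assumption: every array picks either a block or a cyclic distribution, each choice has a made-up local cost, and a fixed penalty applies when two arrays used together disagree. All names and numbers are hypothetical.

```python
import random

CHOICES = ("block", "cyclic")


def cost(partition, pairs, local_cost, mismatch_penalty=5):
    """Toy cost: per-array cost plus a penalty for each co-used pair that disagrees."""
    total = sum(local_cost[i][d] for i, d in enumerate(partition))
    total += sum(mismatch_penalty for a, b in pairs if partition[a] != partition[b])
    return total


def genetic_search(n_arrays, pairs, local_cost, pop_size=20, generations=40, seed=0):
    """Plain GA with elitism; needs n_arrays >= 2 and pop_size >= 4."""
    rng = random.Random(seed)

    def fitness(candidate):
        return cost(candidate, pairs, local_cost)

    # Start from an all-block baseline plus random candidates.
    pop = [tuple("block" for _ in range(n_arrays))]
    pop += [tuple(rng.choice(CHOICES) for _ in range(n_arrays))
            for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=fitness)
        survivors = pop[: pop_size // 2]        # elitism: best half survives unchanged
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_arrays)    # one-point crossover
            child = list(a[:cut] + b[cut:])
            if rng.random() < 0.2:              # mutation: re-draw one gene
                child[rng.randrange(n_arrays)] = rng.choice(CHOICES)
            children.append(tuple(child))
        pop = survivors + children
    return min(pop, key=fitness)


# Hypothetical problem: four arrays, two pairs of arrays used together.
local_cost = [
    {"block": 3, "cyclic": 1},
    {"block": 2, "cyclic": 4},
    {"block": 5, "cyclic": 1},
    {"block": 1, "cyclic": 1},
]
pairs = [(0, 1), (2, 3)]
best = genetic_search(4, pairs, local_cost)
print(best, cost(best, pairs, local_cost))
```

Because the best half of each generation is kept unchanged, the returned partition is never worse than the all-block baseline the search starts from; whether it reaches the true optimum depends on the random seed and parameters.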
|