Current parallelizing compilers cannot identify a significant fraction of parallelizable
loops because they have complex or statically insufficiently defined access patterns.
In our previous work, we have speculatively executed a loop as a doall, and applied a
fully parallel data dependence test to determine if it had any cross–processor depen-
dences. If the test failed, then the loop was re–executed serially. While this method
exploits doall parallelism well, it can cause slowdowns for loops with even one cross-
processor flow dependence because we have to re-execute sequentially. Moreover, the
existing, partial parallelism of loops is not exploited.
We demonstrate a generalization of the speculative doall parallelization tech-
nique, called the Recursive LRPD test, that can extract and exploit the maximum
available parallelism of any loop and that limits potential slowdowns to the over-
head of the run-time dependence test itself. In this thesis, we have presented the
base algorithm and an analysis of the different heuristics for its practical applica-
tion. To reduce the run-time overhead of the Recursive LRPD test, we have im-
plemented on-demand checkpointing and commit, more efficient data dependence
analysis and shadow structures, and feedback-guided load balancing. We obtained
scalable speedups for loops from Track, Spice, and FMA3D that were not paralleliz-
able by previous speculative parallelization methods.
Identifer | oai:union.ndltd.org:tamu.edu/oai:repository.tamu.edu:1969.1/ETD-TAMU-1271 |
Date | 15 May 2009 |
Creators | Dang, Francis Hoai Dinh |
Contributors | Rauchwerger, Lawrence |
Source Sets | Texas A and M University |
Language | en_US |
Detected Language | English |
Type | Book, Thesis, Electronic Thesis, text |
Format | electronic, application/pdf, born digital |
Page generated in 0.0017 seconds