Return to search

Generic implementations of parallel prefix sums and its applications

Parallel prefix sums algorithms are one of the simplest and most useful building
blocks for constructing parallel algorithms. A generic implementation is valuable
because of the wide range of applications for this method.
This thesis presents a generic C++ implementation of parallel prefix sums. The
implementation applies two separate parallel prefix sums algorithms: a recursive
doubling (RD) algorithm and a binary-tree based (BT) algorithm.
This implementation shows how common communication patterns can be separated
from the concrete parallel prefix sums algorithms and thus simplify the work
of parallel programming. For each algorithm, the implementation uses two different
synchronization options: barrier synchronization and point-to-point synchronization.
These synchronization options lead to different communication patterns in the algorithms,
which are represented by dependency graphs between tasks.
The performance results show that point-to-point synchronization performs better
than barrier synchronization as the number of processors increases.
As part of the applications for parallel prefix sums, parallel radix sort and four
parallel tree applications are built on top of the implementation. These applications
are also fundamental parallel algorithms and they represent typical usage of parallel
prefix sums in numeric computation and graph applications. The building of such
applications become straighforward given this generic implementation of parallel
prefix sums.

Identiferoai:union.ndltd.org:tamu.edu/oai:repository.tamu.edu:1969.1/ETD-TAMU-1272
Date15 May 2009
CreatorsHuang, Tao
ContributorsRauchwerger, Lawrence
Source SetsTexas A and M University
Languageen_US
Detected LanguageEnglish
TypeBook, Thesis, Electronic Thesis, text
Formatelectronic, application/pdf, born digital

Page generated in 0.0016 seconds