Spelling suggestions: "subject:"arallel programs (computer programs)"" "subject:"arallel programs (coomputer programs)""
1 |
iC2mpi a platform for parallel execution of graph-structured iterative computations /Botadra, Harnish. January 2006 (has links)
Thesis (M.S.)--Georgia State University, 2006. / Title from title screen. Sushil Prasad, committee chair. Electronic text (106 p. : charts) : digital, PDF file. Description based on contents viewed June 11, 2007. Includes bibliographical references. Includes bibliographical references (p. 61-53).
|
2 |
Productivity with performance: property/behavior-based automated composition of parallel programs from self-describing components / Property/behavior-based automated composition of parallel programs from self-describing componentsMahmood, Nasim, 1976- 28 August 2008 (has links)
Development of efficient and correct parallel programs is a complex task. These parallel codes have strong requirements for performance and correctness and must operate robustly and efficiently across a wide spectrum of application parameters and on a wide spectrum of execution environments. Scientific and engineering programs increasingly use adaptive algorithms whose behavior can change dramatically at runtime. Performance properties are often not known until programs are tested and performance may degrade during execution. Many errors in parallel programs arise in incorrect programming of interactions and synchronizations. Testing has proven to be inadequate. Formal proofs of correctness are needed. This research is based on systematic application of software engineering methods to effective development of efficiently executing families of high performance parallel programs. We have developed a framework (P-COM²) for development of parallel program families which addresses many of the problems cited above. The conceptual innovations underlying P-COM² are a software architecture specification language based on self-describing components, a timing and sequencing algorithm which enables execution of programs with both concrete and abstract components and a formal semantics for the architecture specification language. The description of each component incorporates compiler-useable specifications for the properties and behaviors of the components, the functionality a component implements, pre-conditions and postconditions on the inputs and outputs and state machine based sequencing control for invocations of the component. The P-COM² compiler and runtime system implement these concepts to enable: (a) evolutionary development where a program instance is evolved from a performance model to a complete application with performance known at each step of evolution, (b) automated composition of program instances targeting specific application instances and/or execution environments from self-describing components including generation of all parallel structuring, (c) runtime adaptation of programs on a component by component basis, (d) runtime validation of pre-and post-conditions and sequencing of interactions and (e) formal proofs of correctness for interactions among components based on model checking of the interaction and synchronization properties of the program. The concepts and their integration are defined, the implementation is described and the capabilities of the system are illustrated through several examples.
|
3 |
Productivity with performance property/behavior-based automated composition of parallel programs from self-describing components /Mahmood, Nasim, January 1900 (has links)
Thesis (Ph. D.)--University of Texas at Austin, 2007. / Vita. Includes bibliographical references.
|
4 |
Structured arrows : a type-based framework for structured parallelismCastro, David January 2018 (has links)
This thesis deals with the important problem of parallelising sequential code. Despite the importance of parallelism in modern computing, writing parallel software still relies on many low-level and often error-prone approaches. These low-level approaches can lead to serious execution problems such as deadlocks and race conditions. Due to the non-deterministic behaviour of most parallel programs, testing parallel software can be both tedious and time-consuming. A way of providing guarantees of correctness for parallel programs would therefore provide significant benefit. Moreover, even if we ignore the problem of correctness, achieving good speedups is not straightforward, since this generally involves rewriting a program to consider a (possibly large) number of alternative parallelisations. This thesis argues that new languages and frameworks are needed. These language and frameworks must not only support high-level parallel programming constructs, but must also provide predictable cost models for these parallel constructs. Moreover, they need to be built around solid, well-understood theories that ensure that: (a) changes to the source code will not change the functional behaviour of a program, and (b) the speedup obtained by doing the necessary changes is predictable. Algorithmic skeletons are parametric implementations of common patterns of parallelism that provide good abstractions for creating new high-level languages, and also support frameworks for parallel computing that satisfy the correctness and predictability requirements that we require. This thesis presents a new type-based framework, based on the connection between structured parallelism and structured patterns of recursion, that provides parallel structures as type abstractions that can be used to statically parallelise a program. Specifically, this thesis exploits hylomorphisms as a single, unifying construct to represent the functional behaviour of parallel programs, and to perform correct code rewritings between alternative parallel implementations, represented as algorithmic skeletons. This thesis also defines a mechanism for deriving cost models for parallel constructs from a queue-based operational semantics. In this way, we can provide strong static guarantees about the correctness of a parallel program, while simultaneously achieving predictable speedups.
|
5 |
A distributed reconstruction of EKG signalsCordova, Gabriel, January 2008 (has links)
Thesis (M.S.)--University of Texas at El Paso, 2008. / Title from title screen. Vita. CD-ROM. Includes bibliographical references. Also available online.
|
6 |
Real time cloth modeling using parallel computing /Luo, Zegang. January 2005 (has links)
Thesis (Ph.D.)--Hong Kong University of Science and Technology, 2005. / Includes bibliographical references (leaves 112-123). Also available in electronic version.
|
7 |
A trusted environment for MPI programsFlorez-Larrahondo, German, January 2002 (has links)
Thesis (M.S.)--Mississippi State University. Department of Computer Science. / Title from title screen. Includes bibliographical references.
|
8 |
Analysis of a parallelized neural network training program implemented using MPI and RPCsCordova, Hector. January 2008 (has links)
Thesis (M.S.)--University of Texas at El Paso, 2008. / Title from title screen. Vita. CD-ROM. Includes bibliographical references. Also available online.
|
9 |
Iteratively defined transfinite trace semantics and program slicing with respect to them /Nestra, Härmel, January 2006 (has links) (PDF)
Thesis (doctoral)--University of Tartu, 2006. / Includes bibliographical references.
|
10 |
Dynamic program analysis algorithms to assist parallelizationKim, Minjang 24 August 2012 (has links)
All market-leading processor vendors have started to pursue multicore processors as an alternative to high-frequency single-core processors for better energy and power efficiency. This transition to multicore processors no longer provides the free performance gain enabled by increased clock frequency for programmers. Parallelization of existing serial programs has become the most powerful approach to improving application performance. Not surprisingly, parallel programming is still extremely difficult for many programmers mainly because thinking in parallel is simply beyond the human perception. However, we believe that software tools based on advanced analyses can significantly reduce this parallelization burden.
Much active research and many tools exist for already parallelized programs such as finding concurrency bugs. Instead we focus on program analysis algorithms that assist the actual parallelization steps: (1) finding parallelization candidates, (2) understanding the parallelizability and profits of the candidates, and (3) writing parallel code. A few commercial tools are introduced for these steps. A number of researchers have proposed various methodologies and techniques to assist parallelization. However, many weaknesses and limitations still exist.
In order to assist the parallelization steps more effectively and efficiently, this dissertation proposes Prospector, which consists of several new and enhanced program analysis algorithms.
First, an efficient loop profiling algorithm is implemented. Frequently executed loop can be candidates for profitable parallelization targets. The detailed execution profiling for loops provides a guide for selecting initial parallelization targets.
Second, an efficient and rich data-dependence profiling algorithm is presented. Data dependence is the most essential factor that determines parallelizability. Prospector exploits dynamic data-dependence profiling, which is an alternative and complementary approach to traditional static-only analyses. However, even state-of-the-art dynamic dependence analysis algorithms can only successfully profile a program with a small memory footprint. Prospector introduces an efficient data-dependence profiling algorithm to support large programs and inputs as well as provides highly detailed profiling information.
Third, a new speedup prediction algorithm is proposed. Although the loop profiling can give a qualitative estimate of the expected profit, obtaining accurate speedup estimates needs more sophisticated analysis. Prospector introduces a new dynamic emulation method to predict parallel speedups from annotated serial code. Prospector also provides a memory performance model to predict speedup saturation due to increased memory traffic. Compared to the latest related work, Prospector significantly improves both prediction accuracy and coverage.
Finally, Prospector provides algorithms that extract hidden parallelism and advice on writing parallel code. We present a number of case studies how Prospector assists manual parallelization in particular cases including privatization, reduction, mutex, and pipelining.
|
Page generated in 0.0931 seconds