91 |
Detecting complex genetic mutations in large human genome data
Alsulaiman, Thamer 01 August 2019
All cellular forms of life contain deoxyribonucleic acid (DNA), a molecule that carries all the information necessary to perform both basic and complex cellular functions. DNA is replicated to form new tissues and organs and to pass genetic information to future generations. Replication ideally yields an exact copy of the original DNA. Although it generally occurs without error, mistakes made during replication can introduce accidental changes, called mutations. Mutations range in magnitude, and mutations of any magnitude range in consequence, from no effect on the organism to disease initiation (e.g. cancer) or even death.
In this thesis, we limit our focus to mutations in human DNA, and in particular to MMBIR mutations. Recent literature in human genomics has found microhomology-mediated break-induced replication (MMBIR) to be a common mechanism producing complex mutations in DNA. MMBIRFinder is a tool that detects MMBIR regions in yeast DNA. Although it is successful on yeast DNA, MMBIRFinder cannot detect MMBIR mutations in human DNA. One major reason among several is the amount of computation required to process the much larger human genome data. Our contribution in this regard is twofold:
1) We use parallel computation to significantly reduce the processing time consumed by the original MMBIRFinder, and we address several performance-degrading issues inherent in the original design;
2) We introduce a new heuristic that detects MMBIR mutations missed by the original MMBIRFinder, even in small genomes such as yeast DNA.
|
92 |
Data-parallel programming with multiple inheritance on the connection machine
Girimaji, Sanjay 01 April 1990
Demand is growing for faster computers, and newer computers are being built with more than one CPU. These computers require sophisticated software to program them. One approach to programming multiple-CPU machines is to use object-oriented programming techniques; an example is the use of C* on the Connection Machine.
Though C* supports many object-oriented concepts, it does not support software reuse through inheritance. This thesis introduces a new language called C*++, an extension of the C* language that supports inheritance. We also discuss the issues involved in implementing multiple inheritance in programming languages.
This thesis describes the differences between C*++ and C*. It also discusses the issues involved in the design and implementation of the translator from C*++ to C*, and illustrates the advantages of programming in C*++ through an example. Because C*++ is designed to support software reuse, which allows users to create quality software in a shorter time, it is anticipated that C*++ will have widespread use in programming the Connection Machine.
|
93 |
Distributed parallel computation using standard ML
Chattopadhyay, Vaishali, January 2007
Thesis (M.S. in computer science)--Washington State University, December 2007. / Includes bibliographical references (p. 97-102).
|
94 |
A Skeleton library for Cell Broadband Engine / Ett Skelettbibliotek för Cell Broadband Engine
Ålind, Markus January 2008
The Cell Broadband Engine processor is a powerful processor capable of over 220 GFLOPS. It is highly specialized and can be controlled in detail by the programmer. The Cell is significantly more complicated to program than a standard homogeneous multicore processor such as the Intel Core2 Duo and Quad. This thesis explores the possibility of abstracting away some of the complexities of Cell programming while maintaining high performance. The abstraction is achieved through a library of parallel skeletons implemented in the bulk synchronous parallel programming environment NestStep. The library includes constructs for user-defined, SIMD-optimized data-parallel skeletons such as map and reduce. The evaluation of the library includes porting a vector-based scientific computation program from sequential C code to the Cell using the library and the NestStep environment. The ported program shows good performance compared to the original sequential code run on a high-end x86 processor. The evaluation also shows that, with more than two slave processors, a dot product implemented with the skeleton library is faster than the dot product in the IBM BLAS library for the Cell processor.
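To illustrate how a dot product can be composed from map and reduce skeletons, here is a minimal sequential sketch in Python. It is not the NestStep library or its API: the names `map_skeleton`, `reduce_skeleton` and `dot` are hypothetical, and the sketch shows only the composition pattern, without the parallel or SIMD execution the thesis describes.

```python
from functools import reduce
from operator import add, mul


def map_skeleton(f, *vectors):
    """Apply f elementwise across one or more equal-length vectors (hypothetical name)."""
    return [f(*elems) for elems in zip(*vectors)]


def reduce_skeleton(op, vector, initial):
    """Combine all elements of a vector with a binary operator (hypothetical name)."""
    return reduce(op, vector, initial)


def dot(x, y):
    """Dot product as a map (elementwise multiply) followed by a reduce (sum)."""
    return reduce_skeleton(add, map_skeleton(mul, x, y), 0.0)


result = dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])  # 1*4 + 2*5 + 3*6
print(result)
```

In a real skeleton library the map and reduce steps would each run in parallel over partitions of the vectors; the user-supplied functions (`mul` and `add` here) are the only problem-specific parts.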
|
95 |
The limits of network transparency in a distributed programming language
Collet, Raphaël 19 December 2007
This dissertation studies the extent and limits of network transparency in distributed programming languages. Network transparency states that the result of a distributed program is the same as if it were executed on a single computer, provided no failure occurs. A programming language may also be network aware, meaning that it lets the programmer control how a program is distributed and how it behaves on the network. Both properties aim to simplify distributed programming by making the non-functional aspects of a program more modular.
We show that network transparency is not only possible but also practical: it can be efficient, and it extends smoothly to the case of partial failure. We give a proof of concept with the programming language Oz and the Mozart system, whose distribution support we have reimplemented on top of the Distribution Subsystem (DSS). We have extended the language to control which distribution algorithms are used in a program and to reflect partial failures in the language. Both extensions make it possible to handle non-functional aspects of a program without breaking network transparency.
|
96 |
Design and performance analysis of MPI-SHARC: a high-speed network service for distributed digital signal processor systems
Kohout, James, January 2001
Thesis (M.S.)--University of Florida, 2001. / Title from first page of PDF file. Document formatted into pages; contains ix, 69 p.; also contains graphics. Vita. Includes bibliographical references (p. 66-68).
|
97 |
Implementing a Preconditioned Iterative Linear Solver Using Massively Parallel Graphics Processing Units
Asgari Kamiabad, Amirhassan 26 May 2011
The research conducted in this thesis provides a robust implementation of a preconditioned iterative linear solver on programmable graphics processing units (GPUs). Solving a large, sparse linear system is the most computationally demanding part of many widely used power system analyses. This thesis presents a detailed study of iterative linear solvers with a focus on Krylov-based methods. Since the ill-conditioned nature of power system matrices typically requires substantial preconditioning to ensure the robustness of Krylov-based methods, a polynomial preconditioning technique is also studied. Implementations of the Chebyshev polynomial preconditioner and the biconjugate gradient solver on a programmable GPU are presented and discussed in detail. Evaluation of the GPU-based preconditioner and linear solver on a variety of sparse matrices shows significant computational savings relative to a CPU-based implementation of the same preconditioner and to commonly used direct methods.
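As a rough illustration of what a Chebyshev polynomial preconditioner computes, the sketch below applies a fixed number of textbook Chebyshev iteration steps, started from zero, to approximate the product of the inverse matrix with a vector. It is not the thesis's implementation: it assumes a small dense symmetric positive definite matrix whose eigenvalues lie in a known interval `[lam_min, lam_max]` with `0 < lam_min < lam_max`, it runs on the CPU in pure Python, and the names `matvec` and `chebyshev_apply` are hypothetical.

```python
def matvec(A, v):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def chebyshev_apply(A, r, lam_min, lam_max, steps):
    """Approximate A^{-1} r with `steps` Chebyshev iterations from a zero guess.

    Assumes A is symmetric positive definite with all eigenvalues inside
    [lam_min, lam_max], where 0 < lam_min < lam_max.
    """
    theta = (lam_max + lam_min) / 2.0   # centre of the eigenvalue interval
    delta = (lam_max - lam_min) / 2.0   # half-width of the interval
    sigma = theta / delta
    rho = 1.0 / sigma
    z = [0.0] * len(r)                  # running approximation of A^{-1} r
    res = list(r)                       # residual r - A z (z starts at zero)
    d = [ri / theta for ri in res]      # first update direction
    for _ in range(steps):
        z = [zi + di for zi, di in zip(z, d)]
        Ad = matvec(A, d)
        res = [ri - adi for ri, adi in zip(res, Ad)]
        rho_next = 1.0 / (2.0 * sigma - rho)
        d = [rho_next * rho * di + (2.0 * rho_next / delta) * ri
             for di, ri in zip(d, res)]
        rho = rho_next
    return z


# Small SPD example; its eigenvalues are about 2.38 and 4.62, inside [2, 5].
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
z = chebyshev_apply(A, b, 2.0, 5.0, 20)
print(z)
```

The loop uses only matrix-vector products and vector updates, with no inner products, which is one reason polynomial preconditioners of this kind are often considered a good fit for massively parallel hardware. In a preconditioned Krylov solver, a routine like `chebyshev_apply` would be called with a small, fixed `steps` on each residual rather than run to convergence.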
|
99 |
HW-SW components for parallel embedded computing on NoC-based MPSoCs
Joven Murillo, Jaume 15 March 2010
Recently, in the on-chip and embedded domain, we have been witnessing the growth of the Multi-Processor System-on-Chip (MPSoC) era. Networks-on-chip (NoCs) have been proposed as a viable, efficient, scalable, predictable and flexible solution for interconnecting IP blocks on a chip, or full-featured bus-based systems, in order to create highly complex systems. Thus, high-performance embedded computing is arriving through high hardware parallelism and concurrent software stacks that achieve maximum platform composability and flexibility using pre-designed IP cores. These are the emerging NoC-based MPSoC architectures. However, as the number of IP cores on a single chip increases exponentially, many new challenges arise.
The first challenge is the design of a suitable hardware interconnect that provides adequate Quality of Service (QoS), ensuring certain bandwidth and latency bounds for inter-block communication, at minimal power and area cost. Because the NoC design space is huge, simulation and verification environments must be put in place to explore, validate and optimize many different NoC architectures. The second challenge, currently a hot topic, is to provide efficient and flexible parallel programming models on the new generation of highly parallel NoC-based MPSoCs. This makes it mandatory to use lightweight software libraries that can exploit the hardware features present on the execution platform. Using these software stacks and their associated APIs according to a specific parallel programming model will let software application designers reuse and program parallel applications easily at higher levels of abstraction. Finally, to obtain efficient overall system behaviour, a key research challenge is the design of suitable HW/SW interfaces. This is especially crucial in heterogeneous multiprocessor systems, where parallel programming models and middleware functions must abstract the communication resources during high-level specification of software applications.
The main goal of this dissertation is to enrich the emerging NoC-based MPSoCs by exploring, and making engineering and scientific contributions to, the new challenges that have appeared in recent years. The dissertation addresses all of the above points by:
1) describing an experimental environment for designing NoC-based systems, xENoC, and a NoC design space exploration tool named NoCMaker; this framework enables rapid prototyping and validation of NoC-based MPSoCs;
2) extending Network Interfaces (NIs) to handle heterogeneous traffic from different bus-based standards (e.g. AMBA AHB, OCP-IP), in order to reuse and connect a wide variety of off-the-shelf IP cores and software stacks transparently from the user's point of view;
3) providing runtime QoS features (best-effort and guaranteed services) through NoC-level hardware components and software middleware routines;
4) exploring HW/SW interfaces and resource sharing when a Floating Point Unit (FPU) coprocessor is interfaced to a NoC-based MPSoC;
5) porting parallel programming models, such as shared memory and message passing, to NoC-based MPSoCs. We present the implementation of an efficient, lightweight parallel programming model based on the Message Passing Interface (MPI), called on-chip Message Passing Interface (ocMPI). It enables the design of parallel distributed computing at task level or function level using explicit parallelism and synchronization methods between the cores integrated on the chip;
6) providing runtime application-to-packet QoS support on top of the OpenMP runtime library targeted at shared-memory MPSoCs, in order to boost or balance critical applications or threads during execution.
The key challenges explored in this dissertation are formalized in a HW-SW communication-centric, platform-based design methodology. The outcome of this work will be a robust cluster-on-chip platform for high-performance embedded computing, in which hardware and software components can be reused at multiple levels of design abstraction.
|
100 |
The design and implementation of a region-based parallel programming language
Chamberlain, Bradford L., January 2001
Thesis (Ph. D.)--University of Washington, 2001. / Vita. Includes bibliographical references (p. 362-373).
|