Global ETD Search

121	Adaptive transaction scheduling for transactional memory systems Yoo, Richard M. 01 April 2008 Transactional memory systems are expected to enable parallel programming at lower programming complexity, while delivering improved performance over traditional lock-based systems. Nonetheless, there are certain situations where transactional memory systems could actually perform worse. Transactional memory systems can outperform locks only when the executing workloads contain sufficient parallelism. When the workload lacks inherent parallelism, launching excessive transactions can degrade performance. These situations will actually become dominant in future workloads when large-scale transactions are frequently executed. In this thesis, we propose a new paradigm called adaptive transaction scheduling to address this issue. Based on the parallelism feedback from applications, our adaptive transaction scheduler dynamically dispatches and controls the number of concurrently executing transactions. In our case study, we show that our low-cost mechanism not only guarantees that hardware transactional memory systems perform no worse than a single global lock, but also significantly improves performance for both hardware and software transactional memory systems. Parallelism Performance Transaction effectiveness Contention intensity Transaction systems (Computer systems) Threads (Computer programs) Parallel programming (Computer science) Synchronization
122	Data flow implementations of a lucid-like programming language / by Andrew Lawrence Wendelborn Wendelborn, Andrew Lawrence January 1985 Bibliography: leaves [238]-244 / xi, 244 leaves : ill ; 30 cm. / Title page, contents and abstract only. The complete thesis in print form is available from the University Library. / Thesis (Ph.D.)--University of Adelaide, Dept. of Computer Science, 1985 001.644 19 Translators (Computer programs) Parallel programming (Computer science) Interpreters (Computer programs)
123	VCluster: a portable virtual computing library for cluster computing / Zhang, Hua. January 2008 Thesis (Ph.D.)--University of Central Florida, 2008. / Advisers: Ratan K. Guha, Joohan Lee. Includes bibliographical references (p. 132-143).
124	In pursuit of NP-hard combinatorial optimization problems Ono, Satoshi. January 2009 Thesis (Ph. D.)--State University of New York at Binghamton, Thomas J. Watson School of Engineering and Applied Science, Department of Computer Science, 2009. / Includes bibliographical references.
125	The role of parallel computing in bioinformatics / Akhurst, Timothy John. January 2005 Thesis (M. Sc. (Biochemistry, Microbiology and Biotechnology))--Rhodes University, 2005. / Research report submitted in partial fulfilment of the requirements for the degree of Master of Science.
126	The role of parallel computing in bioinformatics Akhurst, Timothy John January 2005 The need to intelligibly capture, manage and analyse the ever-increasing amount of publicly available genomic data is one of the challenges facing bioinformaticians today. Such analyses are in fact impractical using uniprocessor machines, which has led to an increasing reliance on clusters of commodity-priced computers. An existing network of cheap, commodity PCs was utilised as a single computational resource for parallel computing. The performance of the cluster was investigated using a whole genome-scanning program written in the Java programming language. The TSpaces framework, based on the Linda parallel programming model, was used to parallelise the application. Maximum speedup was achieved at between 30 and 50 processors, depending on the size of the genome being scanned. Together with this, the associated significant reductions in wall-clock time suggest that both parallel computing and Java have a significant role to play in the field of bioinformatics. Bioinformatics Parallel programming (Computer science) LINDA (Computer system) Java (Computer program language) Genomics -- Data processing
127	Explorando memoria transacional em software nos contextos de arquiteturas assimetricas, jogos computacionais e consumo de energia / Exploiting software transactional memory in the context of asymmetric architectures Baldassin, Alexandro José 15 August 2018 Advisor: Paulo Cesar Centoducatte / Thesis (doctorate) - Universidade Estadual de Campinas, Instituto de Computação / Previous issue date: 2009 / Abstract: The shift towards multicore processors taken by the semiconductor industry has initiated an era in which new languages, methodologies and tools are of paramount importance to the development of efficient concurrent systems that can be built in a timely way by all kinds of programmers. One of the main obstacles faced by programmers when dealing with shared memory programming concerns the use of synchronization mechanisms so as to avoid race conditions that could possibly lead the system to an inconsistent state. Synchronization has been traditionally achieved by means of locks (or variations thereof), widely known for their anomalies and for being hard to get right. A new mechanism, known as transactional memory (TM), has recently been the focus of a lot of research and shows potential to simplify code synchronization and to deliver more parallelism and, therefore, better performance. This thesis presents three works focused on different aspects of software transactional memory (STM) systems. Firstly, we show an STM implementation for asymmetric processors, focusing on the architecture of Cell/B.E. As an important result, we find that memory transactions are indeed promising for asymmetric architectures, especially due to their scalability.
Secondly, we take a different approach to STM implementation by devising a system specially targeted at computer games. The decision was guided by the poor performance figures usually seen on current STM implementations. We also conduct a case study using a complex game that effectively shows the system's efficiency. Finally, we present the energy consumption characterization of a state-of-the-art STM for the first time. Based on the observed characterization, we also propose a technique aimed at reducing energy consumption in highly contended scenarios. Our results show that the technique is indeed effective in such cases, improving the energy consumption by up to 87%. / Doctorate / Computer Systems / Doctor of Computer Science Transactional memory Parallel programming (Computer science) Computer architecture Power estimation
128	Investigating tools and techniques for improving software performance on multiprocessor computer systems Tristram, Waide Barrington January 2012 The availability of modern commodity multicore processors and multiprocessor computer systems has resulted in the widespread adoption of parallel computers in a variety of environments, ranging from the home to workstation and server environments in particular. Unfortunately, parallel programming is harder and requires more expertise than the traditional sequential programming model. The variety of tools and parallel programming models available to the programmer further complicates the issue. The primary goal of this research was to identify and describe a selection of parallel programming tools and techniques to aid novice parallel programmers in the process of developing efficient parallel C/C++ programs for the Linux platform. This was achieved by highlighting and describing the key concepts and hardware factors that affect parallel programming, providing a brief survey of commonly available software development tools and parallel programming models and libraries, and presenting structured approaches to software performance tuning and parallel programming. Finally, the performance of several parallel programming models and libraries was investigated, along with the programming effort required to implement solutions using the respective models. A quantitative research methodology was applied to the investigation of the performance and programming effort associated with the selected parallel programming models and libraries, which included automatic parallelisation by the compiler, Boost Threads, Cilk Plus, OpenMP, POSIX threads (Pthreads), and Threading Building Blocks (TBB). Additionally, the performance of the GNU C/C++ and Intel C/C++ compilers was examined.
The results revealed that the choice of parallel programming model or library is dependent on the type of problem being solved and that there is no overall best choice for all classes of problem. However, the results also indicate that parallel programming models with higher levels of abstraction require less programming effort and provide similar performance compared to explicit threading models. The principal conclusion was that problem analysis and parallel design are important factors in the selection of the parallel programming model and tools, but that models with higher levels of abstraction, such as OpenMP and Threading Building Blocks, are favoured. Multiprocessors Multiprogramming (Electronic computers) Parallel programming (Computer science) Linux Abstract data types (Computer science) Threads (Computer programs) Computer programming
129	New Primitives for Tackling Graph Problems and Their Applications in Parallel Computing Zhong, Peilin January 2021 We study fundamental graph problems under parallel computing models. In particular, we consider two parallel computing models: Parallel Random Access Machine (PRAM) and Massively Parallel Computation (MPC). The PRAM model is a classic model of parallel computation. The efficiency of a PRAM algorithm is measured by its parallel time and the number of processors needed to achieve the parallel time. The MPC model is an abstraction of modern massively parallel computing systems such as MapReduce, Hadoop and Spark. The MPC model captures coarse-grained computation on large data well: data is distributed to processors, each of which has a sublinear (in the input data) amount of local memory, and we alternate between rounds of computation and rounds of communication, where each machine can communicate an amount of data as large as the size of its memory. We usually desire fully scalable MPC algorithms, i.e., algorithms that can work for any local memory size. The efficiency of a fully scalable MPC algorithm is measured by its parallel time and the total space usage (the local memory size times the number of machines). Consider an 𝑛-vertex 𝑚-edge undirected graph 𝐺 (either weighted or unweighted) with diameter 𝐷 (the largest diameter of its connected components). Let 𝑁=𝑚+𝑛 denote the size of 𝐺. We present a series of efficient (randomized) parallel graph algorithms with theoretical guarantees. Several results are listed as follows: 1) Fully scalable MPC algorithms for graph connectivity and spanning forest using 𝑂(𝑁) total space and 𝑂(log 𝐷 · log log_{𝑁/𝑛} 𝑛) parallel time. 2) Fully scalable MPC algorithms for 2-edge and 2-vertex connectivity using 𝑂(𝑁) total space, where the 2-edge connectivity algorithm needs 𝑂(log 𝐷 · log log_{𝑁/𝑛} 𝑛) parallel time, and the 2-vertex connectivity algorithm needs 𝑂(log 𝐷 · log² log_{𝑁/𝑛} 𝑛 + log 𝐷' · log log_{𝑁/𝑛} 𝑛) parallel time.
Here 𝐷' denotes the bi-diameter of 𝐺. 3) PRAM algorithms for graph connectivity and spanning forest using 𝑂(𝑁) processors and 𝑂(log 𝐷 · log log_{𝑁/𝑛} 𝑛) parallel time. 4) PRAM algorithms for (1 + 𝜖)-approximate shortest path and (1 + 𝜖)-approximate uncapacitated minimum cost flow using 𝑂(𝑁) processors and poly(log 𝑛) parallel time. These algorithms are built on a series of new graph algorithmic primitives which may be of independent interest. Computer science Computer algorithms Parallel programming (Computer science) SPARK (Computer program language) MapReduce (Computer file) Apache Hadoop
130	Extending Relativistic Programming to Multiple Writers Howard, Philip William 01 January 2012 For software to take advantage of modern multicore processors, it must be safely concurrent and it must scale. Many techniques that allow safe concurrency do so at the expense of scalability. Coarse-grained locking allows multiple threads to access common data safely, but not at the same time. Non-Blocking Synchronization and Transactional Memory techniques optimistically allow concurrency, but only for disjoint accesses and only at a high performance cost. Relativistic programming is a technique that allows low-overhead readers and joint access parallelism between readers and writers. Most of the work on relativistic programming has assumed a single writer at a time (or, in partitionable data structures, a single writer per partition), and single-writer solutions cannot scale on the write side. This dissertation extends prior work on relativistic programming in the following ways: 1) It analyses the ordering requirements of lock-based and relativistic programs in order to clarify the differences in their correctness and performance characteristics, and to define precisely the behavior required of the relativistic programming primitives. 2) It shows how relativistic programming can be used to construct efficient, scalable algorithms for complex data structures whose update operations involve multiple writes to multiple nodes. 3) It shows how disjoint access parallelism can be supported for relativistic writers, using Software Transactional Memory, while still allowing low-overhead, linearly-scalable, relativistic reads. Concurrency Relativistic programming Data structures Synchronization Multicore Multiprocessors -- Programming Systems programming (Computer science) Parallel programming (Computer science)
