1

Seven aspects of software transactional memory

Hughes, Thomas Francis, 13 August 2010
This paper explores different aspects of transactional memory to identify general patterns and to analyze where software transactional memory research may be headed. Hybrid hardware-accelerated transactional memory is shown to be a better long-term solution than purely software or purely hardware transactional memory, based on performance and on the fundamental issue of software complexity. The appendix provides a chronologically ordered summary of significant transactional memory implementations and of transactional-memory-specific benchmarks.
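As one concrete illustration of the hybrid idea (a generic pattern, not any specific design surveyed in the paper), the sketch below attempts a hardware transaction with Intel's RTM intrinsics and falls back to a conventional lock after repeated aborts; reading the fallback flag inside the transaction is what keeps the two paths coherent.

```cpp
#include <immintrin.h>   // _xbegin/_xend/_xabort; compile with -mrtm on a TSX-capable CPU
#include <atomic>
#include <mutex>

std::atomic<bool> fallback_held{false};  // set while the software path owns the data
std::mutex fallback_lock;

// Run `critical` atomically: first in hardware, then under a lock if the
// hardware keeps aborting (contention, capacity overflow, faults).
template <typename F>
void hybrid_atomic(F critical, int max_retries = 3) {
    for (int attempt = 0; attempt < max_retries; ++attempt) {
        unsigned status = _xbegin();
        if (status == _XBEGIN_STARTED) {
            // The load puts the flag in our read set, so a software-path
            // writer aborts this transaction instead of racing with it.
            if (fallback_held.load(std::memory_order_relaxed)) _xabort(0xff);
            critical();
            _xend();        // hardware commit: all or nothing
            return;
        }
        // Transaction aborted (status encodes the cause); retry in hardware.
    }
    std::lock_guard<std::mutex> guard(fallback_lock);
    fallback_held = true;   // make in-flight hardware transactions step aside
    critical();
    fallback_held = false;
}
```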
2

Programming frameworks for performance driven speculative parallelization

Ravichandran, Kaushik, 12 January 2015
Effectively utilizing available parallelism is becoming harder and harder as systems evolve toward many-core processors with tens of cores per chip. Automatically extracting parallelism has limitations, whereas completely redesigning software using traditional parallel constructs is a daunting task that significantly jeopardizes programmer productivity. On the other hand, many studies have shown that a good amount of untapped parallelism exists in sequential software; how to unravel and utilize it successfully remains an open research question. Speculation provides a potential answer: it is a golden bridge for quickly expressing "potential" parallelism in a given program. While speculation at extremely fine granularities has been shown to provide good speed-ups, speculation at larger granularities has only been attempted on a very small scale, because the potentially large overheads render it useless. The transactional construct offered by STMs can be used by programmers to express speculation, since it provides atomicity and isolation while writing parallel code; however, it was not designed to deal with the semantics of speculation. This thesis contends that by incorporating those semantics, new solutions can be constructed, and speculation can provide a powerful means to the hard problem of efficiently utilizing many-cores with very low programmer effort.
This thesis takes a multi-faceted view of the problem through a combination of programming models, compiler analysis, scheduling and runtime systems, and tackles the semantic issues that surround speculation: determining the right degree of speculation to maximize performance, reusing state across rollbacks, providing probabilistic guidance to minimize conflicts, offering deterministic execution for debugging and development, and supporting very large scale speculation across distributed nodes. First, we present F2C2-STM, a high-performance, flux-based, feedback-driven concurrency control technique that automatically selects and adapts the degree of speculation in transactional applications for best performance. Second, we present the Merge framework, which salvages useful work performed during an incorrect speculation and incorporates it into the final commit. Third, we present a framework that leverages the semantics of data structures and algorithmic properties to guide the scheduling of concurrent speculative transactions, minimizing conflicts and performance loss. Fourth, we present DeSTM, a deterministic STM designed to aid the development of speculative transactional applications, providing repeatability without undue performance loss. Together, these contributions enhance transactional memory as a speculative idiom, improving the efficiency of speculative execution and simplifying the development process. Finally, we take a performance-oriented view of speculation in which one of many speculative variants is chosen, dubbed algorithmic speculation, and present the Multiverse framework, which scales algorithmic speculation to a large distributed cluster with thousands of cores while maintaining its simplicity and efficiency.
To conclude, speculative algorithms benefit from the contributions of this thesis through the enhancements to the transactional and algorithmic speculative paradigms developed in this work, laying the foundation for the development and tuning of new speculative algorithms.
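The flux-based, feedback-driven idea behind the first contribution can be pictured as a small controller that samples the commit/abort ratio each epoch and adjusts how many threads may speculate. The class name, thresholds and interface below are illustrative assumptions, not F2C2-STM's actual algorithm.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical feedback controller: shrink the degree of speculation when
// most transactional work is wasted on aborts, grow it when conflicts are rare.
class SpeculationThrottle {
    std::atomic<std::size_t> commits_{0}, aborts_{0};
    std::atomic<std::size_t> allowed_;   // current degree of speculation
    const std::size_t max_threads_;
public:
    explicit SpeculationThrottle(std::size_t max_threads)
        : allowed_(max_threads), max_threads_(max_threads) {}

    void on_commit() { commits_.fetch_add(1, std::memory_order_relaxed); }
    void on_abort()  { aborts_.fetch_add(1, std::memory_order_relaxed); }

    // Thread i runs transactions only while it is below the allowed count.
    bool may_run(std::size_t thread_id) const {
        return thread_id < allowed_.load(std::memory_order_relaxed);
    }

    // Called periodically by a monitor thread.
    void retune() {
        std::size_t c = commits_.exchange(0), a = aborts_.exchange(0);
        if (c + a == 0) return;
        double wasted = double(a) / double(c + a);
        std::size_t cur = allowed_.load(std::memory_order_relaxed);
        if (wasted > 0.5 && cur > 1)                   // heavy conflicts: back off
            allowed_.store(cur - 1, std::memory_order_relaxed);
        else if (wasted < 0.1 && cur < max_threads_)   // headroom: speculate more
            allowed_.store(cur + 1, std::memory_order_relaxed);
    }
};
```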
3

Validity contracts for software transactions

Nguyen, Quan Hoang (Computer Science & Engineering, Faculty of Engineering, UNSW), January 2009
Software Transactional Memory is a promising approach to concurrent programming, freeing programmers from error-prone concurrency control decisions that are complicated and not composable. But few such systems address the consistency of transactional objects. In this thesis, I propose a contract-based transactional programming model toward more secure transactional software. In this general model, a validity contract specifies both requirements and effects for transactions. Validity contracts bring numerous benefits, including reasoning about and verifying transactional programs, detecting and resolving transactional conflicts, automating object revalidation and easing program debugging. I introduce an ownership-based framework, namely AVID, derived from the general model, using object ownership as a mechanism for specifying and reasoning about validity contracts. I have specified a formal type system and implemented a prototype type checker to support static checking. I have also built a transactional library framework, AVID, based on the existing Java DSTM2 framework, for expressing transactions and validity contracts. Experimental results on a multi-core system show that contracts add little overhead to the original STM. I find that contract-aware contention management yields significant speedups in some cases. The results suggest compiler-directed optimisation for tuning contract-based transactional programs. My further work will investigate the applications of transaction contracts to various aspects of TM research, such as hardware support and open-nesting.
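One way to picture a validity contract is as a pair of checks wrapped around a transaction, as in the sketch below. This interface is invented for illustration; the actual AVID framework expresses contracts through object ownership on top of the Java DSTM2 framework, not through an API like this.

```cpp
#include <functional>
#include <stdexcept>

// Illustrative contract: `require` must hold when the transaction starts,
// `effect` must hold when it commits. Real contracts would also drive
// conflict resolution and object revalidation, per the thesis.
struct ValidityContract {
    std::function<bool()> require;
    std::function<bool()> effect;
};

template <typename Txn>
void run_with_contract(const ValidityContract& c, Txn body) {
    if (!c.require()) throw std::logic_error("contract: requirement violated");
    body();  // stand-in for executing `body` as an atomic transaction
    if (!c.effect()) throw std::logic_error("contract: effect not established");
}
```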
4

Understanding and Improving Bloom Filter Configuration for Lazy Address-set Disambiguation

Jeffrey, Mark, 08 December 2011
Many parallelization systems detect memory-access conflicts across concurrent threads by disambiguating address-sets with bit-vector-based Bloom filters, which are efficient but can report false conflicts. Systems with lazy conflict detection often use Bloom filters unconventionally, testing sets for null intersection via Bloom filter intersection rather than the conventional approach of issuing membership queries into the filter. In this dissertation we develop much-needed theory for the probability of false conflicts in Bloom filter null-intersection tests, notably demonstrating that Bloom filter intersection requires substantially larger bit-vectors to provide statistical behavior equivalent to querying. Recognizing that these theoretical implications counter practical intuition, we use RingSTM to evaluate the theory in practice by implementing and comparing the Bloom filter configurations. We find that despite its overheads, the queue-of-queries approach reduces execution time and is thus the most compelling alternative to Bloom filter intersection for lazy address-set disambiguation.
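The two disambiguation styles contrasted here can be sketched as follows (sizes and hash functions are illustrative). For conventional membership queries, the textbook false-positive estimate is roughly (1 − e^(−kn/m))^k for k hashes, n inserted addresses and m bits; the dissertation's point is that the null-intersection test needs a much larger m to match that behavior.

```cpp
#include <bitset>
#include <cstdint>
#include <functional>
#include <vector>

constexpr std::size_t M = 1024;  // bit-vector length (illustrative)
constexpr std::size_t K = 2;     // hash functions per address (illustrative)

using Filter = std::bitset<M>;

// Salted standard hash, for illustration; real signatures use independent hashes.
std::size_t hash_addr(std::uintptr_t addr, std::size_t i) {
    return std::hash<std::uintptr_t>{}(addr ^ (0x9e3779b97f4a7c15ull * (i + 1))) % M;
}

void insert(Filter& f, std::uintptr_t addr) {
    for (std::size_t i = 0; i < K; ++i) f.set(hash_addr(addr, i));
}

// (1) Unconventional: declare a conflict if the two bit-vectors share any set bit.
bool conflict_by_intersection(const Filter& a, const Filter& b) {
    return (a & b).any();
}

// (2) Conventional "queue-of-queries": keep the raw addresses of one
// transaction and issue membership queries into the other's filter.
bool conflict_by_queries(const std::vector<std::uintptr_t>& writes, const Filter& other) {
    for (std::uintptr_t addr : writes) {
        bool maybe_member = true;
        for (std::size_t i = 0; i < K; ++i)
            if (!other.test(hash_addr(addr, i))) { maybe_member = false; break; }
        if (maybe_member) return true;  // all K bits set: report a conflict
    }
    return false;
}
```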
5

Transactions Everywhere

Kuszmaul, Bradley C., Leiserson, Charles E.
Arguably, one of the biggest deterrents for software developers who might otherwise choose to write parallel code is that parallelism makes their lives more complicated. Perhaps the most basic problem inherent in the coordination of concurrent tasks is enforcing atomicity, so that the partial results of one task do not inadvertently corrupt another task. Atomicity is typically enforced through locking protocols, but these protocols can introduce other complications, such as deadlock, unless restrictive methodologies govern their use. We have recently begun a research project focusing on transactional memory [18] as an alternative mechanism for enforcing atomicity, since it allows the user to avoid many of the complications inherent in locking protocols. Rather than viewing transactions as infrequent occurrences in a program, as has generally been done in the past, we have adopted the point of view that all user code should execute in the context of some transaction. Making this viewpoint viable requires the development of two key technologies: effective hardware support for scalable transactional memory, and linguistic and compiler support. This paper describes our preliminary research results on making "transactions everywhere" a practical reality.
Singapore-MIT Alliance (SMA)
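For a flavor of the programming model, the sketch below uses GCC's __transaction_atomic extension (enabled with -fgnu-tm) as a stand-in for the linguistic support the paper envisions; it is not the paper's own syntax, and the account example is invented.

```cpp
// Every access to shared state sits inside a transaction, so no partial
// update is ever visible to another thread and no locks are needed.
// Build with: g++ -fgnu-tm ...
int account_a = 100;
int account_b = 0;

void transfer(int amount) {
    __transaction_atomic {      // the whole block commits or retries as a unit
        account_a -= amount;
        account_b += amount;
    }
}
```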
6

Automatic skeleton-driven performance optimizations for transactional memory

Wanderley Goes, Luis Fabricio, January 2012
The recent shift toward multi-core chips has pushed the burden of extracting performance onto the programmer. In fact, programmers now have to uncover more coarse-grain parallelism with every new generation of processors, or the performance of their applications will remain roughly the same or even degrade. Unfortunately, parallel programming is still hard and error-prone. This has driven the development of many new parallel programming models that aim to make the process more efficient.
This thesis first combines the skeleton-based and transactional memory programming models in a new framework, called OpenSkel, to improve the performance and programmability of parallel applications. The framework provides a single skeleton that allows the implementation of transactional worklist applications. Skeleton- or pattern-based programming lets parallel programs be expressed as specialized instances of generic communication and computation patterns, leaving the programmer with only the implementation of the particular operations required to solve the problem at hand. This approach thus simplifies parallel programming by eliminating some of its major challenges, namely thread communication, scheduling and orchestration. However, the application programmer still has to correctly synchronize threads on data races, which commonly requires locks to guarantee atomic access to shared data. Lock programming, in particular, is vulnerable to deadlocks and also limits coarse-grain parallelism by blocking threads that could potentially execute in parallel.
Transactional Memory (TM) thus emerges as an attractive alternative model for simplifying parallel programming by removing the burden of handling data races explicitly. It allows programmers to write parallel code as transactions, which the runtime system then guarantees to execute atomically and in isolation regardless of any data races. TM programming thus frees the application from deadlocks and enables the exploitation of coarse-grain parallelism when transactions do not conflict very often. Nevertheless, thread management and orchestration are left to the application programmer, and these can be naturally handled by a skeleton framework. This makes the combination of skeleton-based and transactional programming a natural step toward better programmability, since the models complement each other: the combination releases the application programmer from dealing with thread management and data races, and inherits the performance improvements of both models.
In addition, a skeleton framework is amenable to skeleton-driven performance optimizations that exploit the application pattern and system information. This thesis therefore also presents a set of pattern-oriented optimizations that are automatically selected and applied to a significant subset of transactional memory applications sharing a common pattern called a worklist. These optimizations exploit knowledge of the worklist pattern and the TM nature of the applications to avoid transaction conflicts, prefetch data, reduce contention, and so on. Using a novel autotuning mechanism, OpenSkel dynamically selects the most suitable set of these pattern-oriented performance optimizations for each application and adjusts them accordingly.
Experimental results on a subset of five applications from the STAMP benchmark suite show that the proposed autotuning mechanism achieves performance within 2%, on average, of a static oracle on a 16-core UMA (Uniform Memory Access) platform, and surpasses the oracle by 7% on average on a 32-core NUMA (Non-Uniform Memory Access) platform. Finally, this thesis investigates skeleton-driven system-oriented performance optimizations such as thread mapping and memory page allocation, extending the OpenSkel system and the autotuning mechanism to accommodate them. Experimental results on the same subset of STAMP applications show that the OpenSkel framework, with the extended autotuning mechanism driving both pattern- and system-oriented optimizations, achieves performance improvements of up to 88% (46% on average) over a baseline version on a 16-core UMA platform, and of up to 162% (91% on average) on a 32-core NUMA platform.
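The shape of a transactional worklist skeleton might look like the sketch below, where the user supplies only the per-item operation and the framework owns threading and the shared list. The interface is hypothetical, not OpenSkel's actual API, and a mutex stands in for the TM runtime that would execute each step as a transaction.

```cpp
#include <cstddef>
#include <deque>
#include <mutex>
#include <optional>
#include <thread>
#include <vector>

// process_item takes an Item and returns a container of newly generated Items.
template <typename Item, typename Process>
void transactional_worklist(std::deque<Item> work, Process process_item,
                            unsigned nthreads = std::thread::hardware_concurrency()) {
    std::mutex m;               // stand-in for TM-protected access to the worklist
    std::size_t in_flight = 0;  // items currently being processed
    auto worker = [&] {
        for (;;) {
            std::optional<Item> item;
            {
                std::lock_guard<std::mutex> g(m);
                if (!work.empty()) {
                    item = std::move(work.front());
                    work.pop_front();
                    ++in_flight;
                } else if (in_flight == 0) {
                    return;     // list empty and nothing pending: we are done
                }
            }
            if (!item) continue;                  // a peer may still produce work
            auto produced = process_item(*item);  // user code, run speculatively
            std::lock_guard<std::mutex> g(m);
            for (auto& next : produced) work.push_back(std::move(next));
            --in_flight;
        }
    };
    std::vector<std::thread> pool;
    for (unsigned i = 0; i < nthreads; ++i) pool.emplace_back(worker);
    for (auto& t : pool) t.join();
}
```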
7

A Study of Conflict Detection in Software Transactional Memory

Lupei, Daniel, 15 February 2010
Transactional Memory (TM) has been proposed as a simpler parallel programming model than the traditional locking model. However, uptake from the programming community has been slow, primarily because the performance issues of software-based TM strategies are not well understood. In this thesis we conduct a systematic analysis of conflict scenarios that may emerge when enforcing correctness between conflicting transactions. We find that some combinations of conflict detection and resolution strategies perform better than others, depending on the conflict patterns in the application, and we validate these findings by implementing several concurrency control strategies and measuring their relative performance. Based on these observations, we introduce partial rollbacks as a mechanism for effectively compensating for the variability in TM algorithm performance. We show that with this mechanism we can obtain close to the overall best performance for a range of conflict patterns in a synthetically generated workload and a realistic game application.
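The partial-rollback idea can be illustrated with an undo log that records intra-transaction checkpoints, so a conflict undoes work only back to the last checkpoint instead of restarting the whole transaction. The sketch below assumes word-sized writes and is not the thesis's implementation.

```cpp
#include <cstddef>
#include <vector>

struct UndoEntry { int* addr; int old_value; };

class PartialRollbackLog {
    std::vector<UndoEntry> log_;
    std::vector<std::size_t> marks_;  // checkpoint positions in log_
public:
    // Transactional store: remember the old value, then write in place.
    void write(int* addr, int value) {
        log_.push_back({addr, *addr});
        *addr = value;
    }

    void checkpoint() { marks_.push_back(log_.size()); }

    // On conflict, undo only the writes made since the latest checkpoint;
    // with no checkpoints this degenerates to a conventional full abort.
    void rollback_to_checkpoint() {
        std::size_t mark = marks_.empty() ? 0 : marks_.back();
        while (log_.size() > mark) {
            *log_.back().addr = log_.back().old_value;
            log_.pop_back();
        }
        if (!marks_.empty()) marks_.pop_back();
    }
};
```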
8

Data centric and adaptive source changing transactional memory with exit functionality

Herath, Herath Mudiyanselage Isuru Prasenajith, January 2012
Multi-core computing is becoming ubiquitous due to the scaling limitations of single-core computing, and it is inevitable that parallel programming will become mainstream for such processors. In this paradigm shift, the concept of abstraction should not be compromised. A programming model serves as an abstraction of how programs are executed. Transactional Memory (TM) is a technique proposed to maintain lock-free synchronization, and due to the simplicity of its abstraction, TM can also be used as a way of distributing parallel work and maintaining coherence and consistency. Motivated by this, the thesis makes three contributions, all centred around Hardware Transactional Memory (HTM). As the first contribution, a transaction-only architecture is coupled with a "data centric" approach to address the scalability issues of the former whilst maintaining its simplicity; this is achieved by grouping together memory locations with similar access patterns and maintaining coherence and consistency according to the group each memory location belongs to. As the second contribution, a novel technique is proposed to reduce the number of false transaction aborts that occur in a signature-based HTM. The idea is to switch adaptively between cache lines and signatures to detect conflicts: when a transaction fits in the L1 cache, cache-line information is used to detect conflicts, and signatures are used otherwise. As the third contribution, the thesis makes a case for an exit functionality in an HTM. The objective of the proposed functionality, TM_EXIT, is to terminate a transaction without restarting or committing it.
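The distinction TM_EXIT draws between the three ways a transaction can end might be sketched as follows. The enum and runtime shim are invented for clarity; the thesis proposes TM_EXIT as an HTM primitive, not this software loop.

```cpp
// Three outcomes for speculative work: publish it, retry it, or (the new
// case) throw it away and move on without retrying.
enum class TxnOutcome { Commit, Abort, Exit };

struct TxnState {
    // Placeholder speculative state: read/write sets, buffered stores, ...
    void publish() { /* drain the write buffer to memory */ }
    void discard() { /* drop all speculative writes */ }
};

template <typename Body>
void run_transaction(Body body) {
    for (;;) {
        TxnState tx;                   // fresh speculative state each attempt
        switch (body(tx)) {
            case TxnOutcome::Commit: tx.publish(); return;
            case TxnOutcome::Abort:  tx.discard(); break;   // conflict: retry
            case TxnOutcome::Exit:   tx.discard(); return;  // TM_EXIT: no retry
        }
    }
}
```

A worker in a parallel search, for example, could return Exit once it observes that a peer has already found the answer, discarding its own speculative work instead of pointlessly retrying.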
