1 |
A Study of Conflict Detection in Software Transactional Memory. Lupei, Daniel, 15 February 2010
Transactional Memory (TM) has been proposed as a simpler parallel
programming model compared to the traditional locking model. However,
uptake from the programming community has been slow, primarily because
performance issues of software-based TM strategies are not well understood.
In this thesis we conduct a systematic analysis of conflict scenarios
that may emerge when enforcing correctness between conflicting transactions.
We find that some combinations of conflict detection and resolution
strategies perform better than others depending on the conflict patterns
in the application. We validate our findings by implementing several
concurrency control strategies, and by measuring their relative
performance.
Based on these observations, we introduce partial rollbacks as a
mechanism for effectively compensating for the variability in TM algorithm
performance. We show that, using this mechanism, we can
obtain close to the best overall performance for a range of conflict
patterns in a synthetically generated workload and a realistic game application.
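To make the design space concrete, the following minimal Java sketch contrasts eager conflict detection (validate on every access) with lazy detection (validate only at commit). It is illustrative only: the class and method names (VersionedBox, Transaction.read, commit) are ours, not the thesis's, and the commit step is not made atomic as a real STM would.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch only, not the thesis's implementation.
// A shared cell whose version counter is bumped on every committed write.
class VersionedBox {
    final AtomicLong version = new AtomicLong(0);
    volatile int value;
}

class ConflictException extends RuntimeException {}

class Transaction {
    private final Map<VersionedBox, Long> readSet = new HashMap<>();      // box -> version observed
    private final Map<VersionedBox, Integer> writeSet = new HashMap<>();  // buffered writes
    private final boolean eager;  // true: detect conflicts at access time; false: at commit time

    Transaction(boolean eagerDetection) { this.eager = eagerDetection; }

    int read(VersionedBox box) {
        if (writeSet.containsKey(box)) return writeSet.get(box);  // read-own-write
        long v = box.version.get();
        int val = box.value;
        readSet.putIfAbsent(box, v);
        if (eager) validate();  // eager detection: abort the reader as soon as a conflict exists
        return val;
    }

    void write(VersionedBox box, int value) {
        writeSet.put(box, value);  // writes are buffered until commit
    }

    void commit() {
        validate();  // lazy detection: conflicts only surface here
        // NOTE: a real STM would lock or CAS the write-set entries to make this step atomic.
        for (Map.Entry<VersionedBox, Integer> e : writeSet.entrySet()) {
            e.getKey().value = e.getValue();
            e.getKey().version.incrementAndGet();
        }
    }

    private void validate() {
        for (Map.Entry<VersionedBox, Long> e : readSet.entrySet()) {
            if (e.getKey().version.get() != e.getValue()) {
                throw new ConflictException();  // the resolution policy decides who aborts and retries
            }
        }
    }
}
```

Which of the two timings performs better is exactly the kind of workload-dependent trade-off the thesis studies.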
|
2 |
Refactoring for Software Transactional Memory. Baum, Mark Vincent, 27 February 2012
Software transactional memory (STM) is an optimistic, lock-free concurrency mechanism that has the potential to positively transform how concurrent programming is performed. Despite its many desirable attributes, STM is not yet a ubiquitous programming language feature in the commercial software domain. There are many implementation challenges in retrofitting STM into pre-existing language frameworks. Furthermore, existing software systems will also need to be refactored in order to take advantage of STM's unique benefits. As with other time-consuming and error-prone refactoring processes, refactoring for STM is best done with automated tool support; it is the aim of this paper to propose such a tool.
|
3 |
Validity contracts for software transactions. Nguyen, Quan Hoang, Computer Science & Engineering, Faculty of Engineering, UNSW, January 2009
Software Transactional Memory is a promising approach to concurrent programming, freeing programmers from error-prone concurrency control decisions that are complicated and not composable. But few such systems address the consistency of transactional objects. In this thesis, I propose a contract-based transactional programming model toward more secure transactional software. In this general model, a validity contract specifies both requirements and effects for transactions. Validity contracts bring numerous benefits, including reasoning about and verifying transactional programs, detecting and resolving transactional conflicts, automating object revalidation, and easing program debugging. I introduce an ownership-based framework, namely AVID, derived from the general model, using object ownership as a mechanism for specifying and reasoning about validity contracts. I have specified a formal type system and implemented a prototype type checker to support static checking. I have also built a transactional library framework, AVID, based on the existing Java DSTM2 framework, for expressing transactions and validity contracts. Experimental results on a multi-core system show that contracts add little overhead to the original STM. I find that contract-aware contention management yields significant speedups in some cases. The results suggest compiler-directed optimisation for tuning contract-based transactional programs. My further work will investigate the applications of transaction contracts to various aspects of TM research, such as hardware support and open nesting.
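As a rough illustration of the contract idea described above, the sketch below attaches a requirement and an effect to a transactional operation. The annotation names and expression syntax are hypothetical, invented here for illustration; they do not reproduce AVID's actual API.

```java
// Hypothetical contract annotations, for illustration only (not AVID's real syntax).
@interface Requires { String value(); }
@interface Ensures  { String value(); }

class Account {
    int balance;

    // The validity contract states what the transaction requires of the objects it uses
    // (sufficient balance) and what effect it guarantees on commit (balance decreased).
    // A contract-aware STM could use this to detect semantic conflicts or trigger revalidation.
    @Requires("balance >= amount")
    @Ensures("balance == old(balance) - amount")
    void withdraw(int amount) {
        balance -= amount;  // body executed inside a transaction by the STM runtime
    }
}
```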
|
4 |
Compiler Transformations For Improving The Performance Of Software Transactional Memory. Mannarswamy, Sandya S, 11 1900
Expressing synchronization using traditional lock-based primitives has been found to be both error-prone and restrictive. Hence there has been considerable research work to develop scalable and programmer-friendly alternatives to lock-based synchronization. Atomic sections have been proposed as a programming idiom for expressing synchronization at a higher level of abstraction than locks.
One way of supporting atomic sections in software is to rely on an underlying Software Transactional Memory (STM) implementation. While STM offers the promise of being a programming paradigm that is less error-prone and more programmer-friendly than traditional lock-based synchronization, it also needs to be competitive in performance in order to be adopted in mainstream software. Unfortunately, STMs do not meet the performance goals and are known to incur excessive performance overheads.
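For concreteness, here is a minimal Java sketch of the two idioms. The atomic helper below is a stand-in that simply serializes blocks through one global lock, whereas an STM-backed implementation would run the block speculatively and retry it on conflict; the names are illustrative, not taken from the thesis.

```java
import java.util.concurrent.locks.ReentrantLock;

class Transfers {
    static final ReentrantLock globalLock = new ReentrantLock();

    // Stand-in for an atomic section: a real STM would run the block speculatively
    // and retry it on conflict instead of serializing everything through one lock.
    static void atomic(Runnable block) {
        globalLock.lock();
        try { block.run(); } finally { globalLock.unlock(); }
    }

    static class Account { int balance; final Object lock = new Object(); }

    // Lock-based version: the programmer must pick the locks and order them consistently.
    static void transferWithLocks(Account from, Account to, int amount) {
        Object first  = System.identityHashCode(from) < System.identityHashCode(to) ? from.lock : to.lock;
        Object second = (first == from.lock) ? to.lock : from.lock;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }

    // Atomic-section version: the programmer only declares what must be atomic.
    static void transferAtomic(Account from, Account to, int amount) {
        atomic(() -> {
            from.balance -= amount;
            to.balance += amount;
        });
    }
}
```

The atomic version declares only what must be atomic; the lock-based version must also encode how, by choosing and ordering the locks, which is where errors and restrictions creep in.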
First, prior work by other researchers and our own performance analysis of STM applications show that conflicts and the resulting aborts are a major performance bottleneck for STM applications. Second, we find that supporting fine-grained optimistic concurrency can have a significant impact on the cache behavior of applications running on STM and hence can adversely affect STM performance. Our systematic quantitative analysis of the cache behavior of STM applications, as well as prior work on qualitative analysis of STM overheads, shows that cache overheads constitute a major performance bottleneck for STM applications. Hence, in this thesis, we focus on addressing these two major STM performance bottlenecks.
Current STM implementations are typically application-unaware, in that they do not analyze the application and use that knowledge to improve its performance on STM. Closer integration of transactions with programming languages opens up the possibility of using the compiler to analyze STM applications and, with that knowledge, to perform application code transformations that improve performance on STM automatically and transparently to the programmer. This motivated us to address the two major STM performance bottlenecks, namely poor cache performance and the performance penalty due to aborts, through compiler transformations.
In order to pinpoint the cache bottlenecks of STM, we perform a detailed experimental evaluation of the cache behavior of STM applications and quantify the impact of the different STM factors on the cache misses experienced by the applications. We propose a set of compiler transformations targeted at the cache performance bottlenecks identified by our analysis. Next, we turn our attention to compiler analyses and transformations targeted at reducing the performance overheads due to transactional aborts, effectively utilizing the compiler's knowledge of the application's data access patterns. Since not all applications are designed with optimistic concurrency in mind, real-world applications typically contain certain atomic sections which are not amenable to STM's optimistic concurrency control and hence suffer from excessive transactional abort overheads. We propose two compiler techniques for handling such atomic sections.
Another major cause of transactional conflicts leading to unnecessary aborts is the uniform-granularity access tracking scheme employed by STM implementations. Using a single uniform access tracking granularity leads to poor lock assignment by the STM. We propose techniques which use the compiler's knowledge of an application to improve the application-unaware lock assignment made by the STM. Last, as transactional abort overheads adversely impact STM performance, we propose a compiler-based approach that reduces them by reconciling certain kinds of transactions instead of aborting them and performing a complete re-execution. We show that our combined set of compiler transformations is effective in improving the performance of a set of STAMP benchmarks, reducing execution time by 7.48% to 54.82%, aborts by 8.98% to 56.61%, and average D-cache miss latency by up to 33.51%.
|
5 |
Real-Time Software Transactional Memory: Contention Managers, Time Bounds, and Implementations. El-Shambakey, Mohammed Talat, 02 October 2013
Lock-based concurrency control suffers from programmability, scalability, and composability challenges. These challenges are exacerbated in emerging multicore architectures, on which improved software performance must be achieved by exposing greater concurrency. Transactional memory (TM) is an emerging alternative synchronization model for shared memory objects that promises to alleviate these difficulties.
In this dissertation, we consider software transactional memory (STM) for concurrency control in multicore real-time software, and present a suite of real-time STM contention managers for resolving transactional conflicts. The contention managers are called ECM, RCM, LCM, PNF, and FBLT. RCM and ECM resolve conflicts using fixed and dynamic priorities of real-time tasks, respectively, and are naturally intended to be used with fixed-priority (e.g., G-RMA) and dynamic-priority (e.g., G-EDF) multicore real-time schedulers, respectively. LCM resolves conflicts based on task priorities as well as atomic section lengths, and can be used with G-EDF or G-RMA schedulers. Transactions under ECM, RCM, and LCM may retry due to conflicts with higher-priority tasks even when there are no shared objects, i.e., transitive retry. PNF avoids transitive retry and optimizes processor usage by lowering the priority of retrying transactions, thereby enabling other non-conflicting transactions to proceed. PNF, however, requires a priori knowledge of all requested objects for each atomic section, which is inconsistent with the semantics of dynamic STM. Moreover, its centralized design increases overhead. FBLT avoids transitive retry, does not require a priori knowledge of requested objects, and has a decentralized design.
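The core decision made by the priority-based managers can be sketched as follows. This is an illustrative Java sketch only; the class names are ours, and the actual ECM/RCM designs in the dissertation also encode retry rules and the interaction with the underlying scheduler.

```java
// Illustrative sketch of a priority-based contention manager in the spirit of ECM/RCM:
// when two transactions conflict, the one attached to the lower-priority task retries.
final class TaskInfo {
    final int priority;  // fixed priority (RCM/G-RMA style) or derived from the deadline (ECM/G-EDF style)
    TaskInfo(int priority) { this.priority = priority; }
}

final class TxDescriptor {
    final TaskInfo task;
    TxDescriptor(TaskInfo task) { this.task = task; }
}

enum Decision { ABORT_SELF, ABORT_ENEMY }

final class PriorityContentionManager {
    // Called by the STM when transaction 'self' detects a conflict with transaction 'enemy'.
    Decision resolve(TxDescriptor self, TxDescriptor enemy) {
        if (self.task.priority > enemy.task.priority) {
            return Decision.ABORT_ENEMY;  // the higher-priority task's transaction proceeds
        }
        return Decision.ABORT_SELF;       // otherwise the requester aborts and retries
    }
}
```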
We establish upper bounds on transactional retry costs and task response times under the contention managers through schedulability analysis. Since ECM and RCM preserve the semantics of the underlying real-time scheduler, their maximum transactional retry cost is double the maximum atomic section length. This is improved in the design of LCM, which achieves shorter retry costs and tighter upper bounds. As PNF avoids transitive retry and improves processor usage, it yields shorter retry costs and tighter upper bounds than ECM, RCM, and LCM. FBLT's upper bounds are similarly tight because it combines the advantages of PNF and LCM.
We formally compare the proposed contention managers with each other, with lock-free synchronization, and with multiprocessor real-time locking protocols. Our analysis reveals that, for most cases, ECM, RCM, and LCM achieve higher schedulability than lock-free synchronization only when the atomic section length does not exceed half of lock-free synchronization's retry loop length. With equal periods and greater access times for shared objects, atomic section length under ECM, RCM, and LCM can be much larger than the retry loop length while still achieving better schedulability. With proper values for LCM's design parameters, atomic section length can be larger than the retry loop length for better schedulability. Under PNF, atomic section length can exceed lock-free's retry loop length and still achieve better schedulability in certain cases. FBLT achieves equal or better schedulability than lock-free with appropriate values for design parameters. The schedulability advantage of the contention managers over multiprocessor real-time locking protocols such as Global OMLP and RNLP depends upon the value of $s_{max}/L_{max}$, the ratio of the maximum transaction length to the maximum critical section length. FBLT's schedulability is equal to or better than Global OMLP and RNLP if $s_{max}/L_{max} \le 2$.
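Restating the two quoted bounds compactly in the abstract's notation, with $s_{max}$ the maximum atomic section (transaction) length, $L_{max}$ the maximum critical section length, and $RC_{max}$ a symbol introduced here for the maximum transactional retry cost:

```latex
% Retry-cost bound preserved by ECM and RCM (from the schedulability analysis quoted above):
\[ RC_{max} \le 2\, s_{max} \]
% Condition under which FBLT's schedulability is at least that of Global OMLP and RNLP:
\[ \frac{s_{max}}{L_{max}} \le 2 \]
```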
Checkpointing enables partial roll-back of transactions by recording transaction execution states (i.e., checkpoints) during execution, allowing roll-back to a previous checkpoint instead of transaction start, improving task response time. We extend FBLT with checkpointing and develop CP-FBLT, and identify the conditions under which CP-FBLT achieves equal or better schedulability than FBLT.
We implement the contention managers in the Rochester STM framework and conduct experimental studies using a multicore real-time Linux kernel. Our studies reveal that among the contention managers, CP-FBLT has the best average-case performance. CP-FBLT's higher performance is due to the fact that PNF's and LCM's advantages are combined into the design of FBLT, which is the base of CP-FBLT. Moreover, checkpointing improves task response time. The contention managers were also found to have equal or better average-case performance than lock-free synchronization: more jobs meet their deadlines using CP-FBLT, FBLT, and PNF than lock-free synchronization by 34.6%, 28.5%, and 32.4% (on average), respectively. The superiority of the contention managers is directly due to their better conflict resolution policies.
Locking protocols such as OMLP and RNLP were found to perform better: more jobs meet their deadlines under OMLP and RNLP than any contention manager by 12.4% and 13.7% (on average), respectively. However, the proposed contention managers have numerous qualitative advantages over locking protocols. Locks do not compose, whereas STM transactions do. To allow multiple objects to be accessed in a critical section, OMLP assigns objects to non-conflicting groups, where each group is protected by a distinct lock. RNLP assumes that objects are accessed in a specific order to prevent deadlocks. In contrast, STM allows multiple objects to be accessed in a transaction in any order, while guaranteeing deadlock-freedom, which significantly increases programmability. Moreover, STM offers platform independence: the proposed contention managers can be entirely implemented in the user-space as a library. In contrast, real-time locking protocols such as OMLP and RNLP must be supported by the underlying platform (i.e., operating system or virtual machine).
|
6 |
On Partial Aborts and Reducing Validation Costs in Fault-tolerant Distributed Transactional Memory. Dhoke, Aditya Anil, 30 September 2013
Distributed Transactional Memory (DTM) is an emerging synchronization abstraction that promises to alleviate the scalability, programmability, and composability challenges of lock-based distributed synchronization. With DTM, programmers organize code that reads/writes shared memory objects, both local and remote, as memory transactions. An underlying DTM framework guarantees atomicity, isolation, and consistency properties for those transactions through speculative concurrency control.
In DTM, restarting an aborted transaction from the beginning can degrade performance, as transactional conflicts may have occurred in the later part of the transaction, wasting work. The partial abort method alleviates this difficulty by enabling a transaction to be rolled back to the point where the objects in the transaction's read-set and write-set are still consistent. In this thesis, we present protocols for supporting partial aborts in QR-DTM, a fault-tolerant DTM that uses quorum protocols for distributed concurrency control in the presence of failures. We focus on two partial abort models: closed nesting, which allows transactions to be nested and inner transactions to be rolled back without rolling back outer transactions; and checkpointing, which allows the transaction state to be saved in checkpoints throughout execution and transactions to be rolled back to those checkpoints. We present protocols that support these partial abort models in QR-DTM, called QR-CN and QR-CHK. We implemented these protocols and conducted experimental studies using macro-benchmarks (e.g., distributed versions of the STAMP benchmark) and micro-benchmarks (e.g., distributed data structures). Our studies reveal that QR-CN improves throughput by as much as 101% over flat nesting in specific cases, with an average improvement of 53%.
We also develop QR-ACN, a framework that automatically decomposes programmer-written transactions into closed nested transactions, at run-time, to improve performance. The composition of a closed nested transaction depends on the contention of objects, which can change at run-time depending upon the workload at hand. Our implementation and experimental studies reveal that QR-ACN consistently outperforms flat nesting by an average of 51% on benchmarks including TPC-C.
False conflicts occur when high-level operations, even though semantically independent, traverse the same set of objects during transaction execution. Such conflicts can lead to repeated aborts, increasing transaction execution time and degrading performance, which can be significant in DTM, since transaction execution also includes network communication. We consider the approach of reducing validation cost for resolving false conflicts, and present three protocols for reducing validation cost in DTM. Our first protocol, QR-ON, incorporates the open nesting model into QR-DTM. Open nesting allows inner-nested transactions to commit independently of their parent transaction, releasing objects in the transaction read-set and write-set early, minimizing aborts due to false conflicts. We then present QR-OON, in which open-nested transactions commit asynchronously so that subsequent transactions can proceed without waiting for the commit of previous transactions. Finally, we present an early release methodology, QR-ER, in which objects that do not affect the final state of the shared data are dropped from the transaction's read-set, which avoids false conflicts and reduces communication costs.
Our implementation and experimental studies revealed that QR-OON outperforms QR-ON by up to 43%, and that QR-ER outperforms QR-ON and QR-OON by up to 10% on micro- and macro-benchmarks.
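For concreteness, the checkpointing model of partial aborts described above can be pictured with the following Java sketch. The Checkpoint/rollbackTo API is hypothetical, invented here to illustrate the idea; it is not QR-CHK's actual interface.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical checkpointing API, for illustration only (not QR-CHK's real interface).
final class Checkpoint {
    final Map<Object, Object> savedWrites;  // snapshot of the buffered write-set
    Checkpoint(Map<Object, Object> writes) { this.savedWrites = new HashMap<>(writes); }
}

final class PartialAbortTx {
    private final Map<Object, Object> writeSet = new HashMap<>();
    private final Deque<Checkpoint> checkpoints = new ArrayDeque<>();

    void write(Object key, Object value) { writeSet.put(key, value); }

    // Record the transaction state; later conflicts can roll back to this point
    // instead of restarting the whole transaction.
    Checkpoint checkpoint() {
        Checkpoint cp = new Checkpoint(writeSet);
        checkpoints.push(cp);
        return cp;
    }

    // Partial abort: discard only the work done after the given checkpoint,
    // leaving the checkpoint itself on the stack so it can be reused.
    void rollbackTo(Checkpoint cp) {
        writeSet.clear();
        writeSet.putAll(cp.savedWrites);
        while (!checkpoints.isEmpty() && checkpoints.peek() != cp) checkpoints.pop();
    }
}
```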
|
7 |
ByteSTM: Java Software Transactional Memory at the Virtual Machine Level. Mahmoud Mohamedin, Mohamed Ahmed, 21 March 2012
As chip vendors are increasingly manufacturing a new generation of multi-processor chips called multicores, improving software performance requires exposing greater concurrency in software. Since the need for synchronization is often what forces code to run sequentially, the synchronization abstraction has a significant effect on program performance. Lock-based synchronization, the most widely used synchronization method, suffers from programmability, scalability, and composability challenges.
Transactional memory (TM) is an emerging synchronization abstraction that promises to alleviate the difficulties with lock-based synchronization. With TM, code that reads/writes shared memory objects is organized as transactions, which execute speculatively. When two transactions conflict (e.g., read/write, write/write), one of them is aborted while the other commits, yielding (the illusion of) atomicity. Aborted transactions are restarted after rolling back the changes they made to objects. In addition to a simple programming model, TM provides performance comparable to lock-based synchronization. Software transactional memory (STM) implements TM entirely in software, without any special hardware support, and is usually implemented as a library, or supported by a compiler or a virtual machine.
In this thesis, we present ByteSTM, a virtual machine-level Java STM implementation. ByteSTM implements two STM algorithms, TL2 and RingSTM, and transparently supports implicit transactions. Program bytecode is automatically modified to support transactions: memory load/store bytecode instructions automatically switch to transactional mode when a transaction starts, and switch back to normal mode when the transaction successfully commits. Being implemented at the VM level, ByteSTM accesses memory directly and uses absolute memory addresses to handle memory uniformly. Moreover, it avoids Java garbage collection (which has a negative impact on STM performance) by manually allocating and recycling memory for transactional metadata. ByteSTM uses field-based granularity, and uses the thread header to store transactional metadata instead of the slower Java ThreadLocal abstraction.
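Conceptually, the instrumentation redirects each transactional field access through a read or write barrier. The sketch below shows source-level code equivalent to such a transformation; the STMRuntime class and its methods are hypothetical stand-ins and are not ByteSTM's actual internals.

```java
// Hypothetical barriers standing in for VM-level read/write instrumentation (not ByteSTM's API).
final class STMRuntime {
    static Object onRead(Object owner, String field, Object currentValue) {
        // Stub: a real STM would record (owner, field) in the read-set and
        // return any value already buffered in the write-set.
        return currentValue;
    }
    static void onWrite(Object owner, String field, Object newValue) {
        // Stub: a real STM would buffer the write until commit.
    }
}

class Counter {
    int value;

    // Inside an implicit transaction, the original statement
    //     value = value + 1;
    // is, in effect, rewritten so the load and store go through the barriers:
    void incrementInstrumented() {
        int v = (Integer) STMRuntime.onRead(this, "value", this.value);
        STMRuntime.onWrite(this, "value", v + 1);
    }
}
```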
We conducted experimental studies comparing ByteSTM with other state-of-the-art Java STMs, including Deuce, ObjectFabric, Multiverse, DSTM2, and JVSTM, on a set of micro-benchmarks and macro-benchmarks. Our results reveal that ByteSTM's transactional throughput improvement over competitors ranges from 20% to 75% on micro-benchmarks and from 36% to 100% on macro-benchmarks.
|
8 |
Performance evaluation of Clojure's transactional memory system as a synchronization library in the Java language. Calcina Ccori, Pablo César, 14 June 2011
In this work, a performance evaluation of Clojure's transactional memory implementation is presented, using it as a synchronization library that works together with other applications on the Java virtual machine. An interface layer is implemented between Clojure's data structures and the STMBench7 benchmark, and some aspects that cause performance overhead are discussed.
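As an illustration of the kind of interoperability being evaluated, Clojure's STM can be driven directly from Java through clojure.lang.Ref and clojure.lang.LockingTransaction. This is a minimal sketch that assumes the Clojure runtime jar is on the classpath; it is not the interface layer built in the thesis.

```java
import clojure.lang.LockingTransaction;
import clojure.lang.Ref;
import java.util.concurrent.Callable;

public class ClojureStmFromJava {
    public static void main(String[] args) throws Exception {
        final Ref from = new Ref(100L);
        final Ref to = new Ref(0L);

        // Runs the body as a Clojure STM transaction (the Java-level equivalent of dosync);
        // conflicting transactions are retried automatically by the runtime.
        Callable<Object> transfer = () -> {
            long amount = 40L;
            from.set((Long) from.deref() - amount);
            to.set((Long) to.deref() + amount);
            return null;
        };
        LockingTransaction.runInTransaction(transfer);

        System.out.println("from=" + from.deref() + " to=" + to.deref());
    }
}
```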
|
9 |
On improving the ease of use of the software transactional memory abstraction. Crain, Tyler, 06 March 2013
Multicore architectures are changing the way we write programs. Writing concurrent programs is well known to be a difficult task. Traditionally, the use of locks, allowing code to execute in mutual exclusion, has been the most widely used abstraction for writing concurrent programs. Unfortunately, using locks, it is difficult to write correct concurrent programs that perform efficiently. Additionally, locks present other problems, such as scalability issues. Transactional memory has been proposed as a promising solution to these difficulties of writing concurrent programs. Transactions can be viewed as a high-level abstraction, or methodology, for writing concurrent programs, allowing the programmer to declare which sections of code should be executed atomically, without having to worry about synchronization details. Unfortunately, although arguably easier to use than locks, transactional memory still suffers from performance and ease-of-use problems. In fact, many concepts surrounding the usage and semantics of transactions have no widely agreed-upon standards. This thesis specifically focuses on these ease-of-use problems by discussing how previous research has dealt with them and proposing new solutions that put ease of use first. The thesis starts with a chapter giving a brief overview of software transactional memory (STM) as well as a discussion of the problem of ease of use that is the focus of the later chapters. The research contributions are then divided into four main chapters, each looking at a different approach to making transactional memory easier to use.
|