Global ETD Search

1	On Improving Distributed Transactional Memory through Nesting, Partitioning and Ordering Turcu, Alexandru 03 March 2015 Distributed Transactional Memory (DTM) is an emerging, alternative concurrency control model that aims to overcome the challenges of distributed lock-based synchronization. DTM employs transactions in order to guarantee consistency in a concurrent execution. When two or more transactions conflict, all but one need to be delayed or rolled back. Transactional Memory supports code composability by nesting transactions. Nesting, however, can also be used as a strategy to improve performance. The closed nesting model enables partial rollback by allowing a sub-transaction to abort without aborting its parent, thus reducing the amount of work that needs to be retried. In the open nesting model, sub-transactions can commit to the shared state independently of their parents. This reduces isolation and increases concurrency. Our first main contribution in this dissertation is a pair of extensions to the existing Transactional Forwarding Algorithm (TFA). The extensions, N-TFA and TFA-ON, support closed nesting and open nesting, respectively. We additionally extend the existing SCORe algorithm with support for open nesting (we call the result SCORe-ON). We implement these algorithms in a Java DTM framework and evaluate them. This represents the first study of transaction nesting in the context of DTM, and contributes the first DTM implementation which supports closed nesting or open nesting. Closed nesting through our N-TFA implementation proved insufficient for any significant throughput improvements. It ran on average 2% faster than flat nesting, while performance for individual tests varied between 42% slowdown and 84% speedup. The workloads that benefit most from closed nesting are characterized by short transactions, with between two and five sub-transactions. 
Open nesting, as exemplified by our TFA-ON and SCORe-ON implementations, showed promising results. We determined performance improvement to be a trade-off between the overhead of additional commits and the fundamental conflict rate. For write-intensive, high-conflict workloads, open nesting may not be appropriate, and we observed a maximum speedup of 30%. On the other hand, for lower fundamental-conflict workloads, open nesting enabled speedups of up to 167% in our tests. In addition to the two nesting algorithms, we also develop Hyflow2, a high-performance DTM framework for the Java Virtual Machine, written in Scala. It has a clean Scala API and a compatibility Java API. Hyflow2 was on average two times faster than Hyflow on high-contention workloads, and up to 16 times faster in low-contention workloads. Our second main contribution for improving DTM performance is automated data partitioning. Modern transactional processing systems need to be fast and scalable, but this means many such systems settle for weak consistency models. It is, however, possible to achieve all of strong consistency, high scalability and high performance by using fine-grained partitions and light-weight concurrency control that avoids superfluous synchronization and other overheads such as lock management. Independent transactions are one such mechanism; they rely on good partitions and appropriately defined transactions. On the downside, it is not usually straightforward to determine optimal partitioning schemes, especially when dealing with non-trivial amounts of data. Our work attempts to solve this problem by automating the partitioning process, choosing the correct transactional primitive, and routing transactions appropriately. Our third main contribution is Alvin, a system for managing concurrently running transactions on a geographically replicated data-store. Alvin supports general-purpose transactions, and guarantees strong consistency criteria. 
Through a novel partial order broadcast protocol, Alvin maximizes the parallelism of ordering and local transaction processing, resulting in low client-perceived latency. Alvin can process read-only transactions either locally or globally, according to the desired consistency criterion. Conflicting transactions are ordered across all sites. We built Alvin in the Go programming language. We conducted our evaluation study on Amazon EC2 infrastructure and compared against Paxos- and EPaxos-based state machine replication protocols. Our results reveal that Alvin provides significant speed-up for read-dominated TPC-C workloads: as much as 4.8x when compared to EPaxos on 7 datacenters, and up to 26% in write-intensive workloads. Our fourth and final contribution is M2Paxos, a multi-leader implementation of Generalized Consensus. Single leader-based consensus protocols are known to stop scaling once the leader reaches its saturation point. Ordering commands based on conflicts is appealing due to the potentially higher parallelism, but is imperfect due to the higher quorum sizes required for fast decisions and the need to compare commands and track their dependencies. M2Paxos, on the other hand, exploits fast decisions (i.e., delivery of a command in two communication delays) by leveraging a classic quorum size, matching a majority of nodes deployed. M2Paxos does not establish command dependencies based on conflicts; instead, it binds accessed objects to nodes, making sure commands operating on the same object will be ordered by the same node. Our evaluation study of M2Paxos (also built in Go) confirms the effectiveness of this approach, achieving up to 7x improvement in performance over state-of-the-art consensus and generalized consensus algorithms. / Ph. D. Distributed Transactional Memory Distributed Systems Nested Transactions Automated Partitioning Consensus
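
The M2Paxos idea of binding objects to nodes, so that all commands on one object are ordered by a single node, can be illustrated with a small single-process sketch. This is not the dissertation's Go implementation: the class `OwnershipRouter`, its `submit` method, the "first submitter becomes owner" rule, and the restriction to single-object commands are all assumptions made for illustration, and no messaging or consensus round is modeled. The helper `majority` shows the classic quorum size the abstract refers to.

```python
from typing import Dict, List, Tuple


def majority(n_nodes: int) -> int:
    """Classic quorum size: the smallest strict majority of n_nodes."""
    return n_nodes // 2 + 1


class OwnershipRouter:
    """Toy model (hypothetical): each object is bound to exactly one node,
    and that node alone decides the order of commands on the object."""

    def __init__(self, n_nodes: int) -> None:
        self.n_nodes = n_nodes
        self.owner: Dict[str, int] = {}        # object id -> owning node
        self.order: Dict[str, List[str]] = {}  # object id -> ordered commands

    def submit(self, node: int, obj: str, cmd: str) -> Tuple[int, int]:
        """Submit cmd on obj from node; return (ordering node, position)."""
        if not 0 <= node < self.n_nodes:
            raise ValueError("unknown node")
        # Assumption: the first node to touch an unowned object becomes its
        # owner; later submitters have their command ordered by that owner.
        owner = self.owner.setdefault(obj, node)
        sequence = self.order.setdefault(obj, [])
        sequence.append(cmd)
        return owner, len(sequence) - 1


router = OwnershipRouter(n_nodes=5)
print(majority(5))                     # quorum size for 5 nodes
print(router.submit(0, "x", "write"))  # node 0 becomes owner of "x"
print(router.submit(3, "x", "write"))  # ordered by node 0, after the first
print(router.submit(3, "y", "write"))  # node 3 becomes owner of "y"
```

Because ordering is decided per object by its owner, commands on different objects never need to be compared against each other, which is the property the abstract contrasts with conflict-based dependency tracking.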
2	Scheduling Memory Transactions in Distributed Systems Kim, Junwhan 15 October 2013 Distributed transactional memory (DTM) is an emerging, alternative concurrency control model that promises to alleviate the difficulties of lock-based distributed synchronization. In DTM, transactional conflicts are traditionally resolved by a contention manager. A complementary approach for handling conflicts is through a transactional scheduler, which orders transactional requests to avoid or minimize conflicts. We present a suite of transactional schedulers: Bi-interval, Commutative Requests First (CRF), Reactive Transactional Scheduler (RTS), Dependency-Aware Transactional Scheduler (DATS), Scheduling-based Parallel Nesting (SPN), Cluster-based Transactional Scheduler (CTS), and Locality-aware Transactional Scheduler (LTS). The schedulers consider Herlihy and Sun's dataflow execution model, where transactions are immobile and objects are migrated to invoking transactions, relying on directory-based cache-coherence protocols to locate and move objects. Within this execution model, the proposed schedulers target different DTM models. Bi-interval considers the single object copy DTM model, and categorizes concurrent requests into read and write intervals to maximize the concurrency of read transactions. This allows an object to be simultaneously sent to read transactions, improving transactional makespan. We show that Bi-interval improves the makespan competitive ratio of DTM without such a scheduler to O(log(N)) for the worst-case and Θ(log(N - k)) for the average-case, for N nodes and k read transactions. Our implementation reveals that Bi-interval enhances transactional throughput over the no-scheduler case by as much as 1.71x, on average. CRF considers multi-versioned DTM. Traditional multi-versioned TM models use multiple object versions to guarantee commits of read transactions, but limit concurrency of write transactions. 
CRF relies on the notion of commutative transactions, i.e., those that ensure consistency of the shared data-set even when they are validated and committed concurrently. CRF detects conflicts between commutative and non-commutative write transactions and then schedules them according to the execution state, enhancing the concurrency of write transactions. Our implementation shows that transactional throughput is improved by up to 5x over a state-of-the-art competitor (DecentSTM). RTS and DATS consider transactional nesting in DTM, and focus on the closed and open nesting models, respectively. RTS determines whether a conflicting outer transaction must be aborted or enqueued according to the level of contention. If a transaction is enqueued, its closed-nested transactions do not have to retrieve objects again, resulting in reduced communication delays. DATS's goal is to boost the throughput of open-nested transactions by reducing the overhead of running expensive compensating actions and acquiring/releasing abstract locks when the outer transaction aborts. The contribution of DATS is twofold. First, it allows commutable outer transactions to be validated concurrently and allows non-commutable outer transactions -- depending on their inner transactions -- to be committed before others without dependencies. Implementations reveal effectiveness: RTS and DATS improve throughput (over the no-scheduler case), by as much as 1.88x and 2.2x, respectively. SPN considers parallel nested transactions in DTM. The idea of parallel nesting is to execute the inner transactions that access different objects concurrently, and execute the inner transactions that access the same objects serially, increasing performance. However, the parallel nesting model may be ineffective if all inner transactions access the same object due to the additional overheads needed to identify both types of inner transactions. 
SPN avoids this overhead and allows inner transactions to request objects and to execute them in parallel. Implementations reveal that SPN outperforms non-parallel nesting (i.e., closed nesting) by up to 3.5x and 4.5x on a micro-benchmark (bank) and the TPC-C transactional benchmark, respectively. CTS considers the replicated DTM model: object replicas are distributed across clusters of nodes, where clusters are determined based on inter-node distance, to maximize locality and fault-tolerance, and to minimize memory usage and communication overhead. CTS enqueues transactions that are aborted due to early validation over clusters and assigns their backoff times, reducing communication overhead. Implementation reveals that CTS improves throughput over competitor replicated DTM solutions including GenRSTM and DecentSTM by as much as 1.64x, on average. LTS considers the genuine partial replicated DTM model. In this model, LTS exploits locality by: 1) employing a transaction scheduler, which enables/disables object ownership changes depending on workload fluctuations, and 2) splitting hot-spot objects into multiple replicas for reducing contention. Our implementation reveals that LTS outperforms state-of-the-art competitors (Score and CTS) by up to 2.6x on micro-benchmarks (Linked List and Skip List) and by up to 2.2x on TPC-C. / Ph. D. Software Transactional Memory Distributed Systems Transactional Scheduling Partial Replication (Parallel) Nested Transactions
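
The Bi-interval idea of grouping concurrent requests for one object into a read interval and a write interval can be sketched as follows. This is a minimal illustration, not the dissertation's scheduler: the function name `bi_interval_batches`, the request encoding, and the choice to serve the read interval before the write interval are assumptions, since the abstract does not state the actual ordering policy.

```python
from typing import List, Tuple


def bi_interval_batches(requests: List[Tuple[str, str]]) -> List[List[str]]:
    """Group pending requests for a single object into service batches.

    requests: (transaction id, kind) pairs, where kind is "r" or "w".
    Returns a list of batches; each batch is served in one object transfer.
    """
    reads: List[str] = []
    writes: List[str] = []
    for txn, kind in requests:
        if kind == "r":
            reads.append(txn)
        elif kind == "w":
            writes.append(txn)
        else:
            raise ValueError(f"unknown request kind: {kind!r}")

    batches: List[List[str]] = []
    if reads:
        # Read interval: the object is sent to all readers simultaneously.
        batches.append(reads)
    # Write interval: writers need exclusive access, so one per batch.
    batches.extend([w] for w in writes)
    return batches


pending = [("t1", "w"), ("t2", "r"), ("t3", "r"), ("t4", "w")]
print(bi_interval_batches(pending))
```

In this toy model, four requests are served in three batches because the two reads share one transfer; with more concurrent readers, the saving over serving every request separately grows accordingly.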
