Global ETD Search

71	Software Transactional Memory Building Blocks Riegel, Torvald 13 May 2013 (has links) Exploiting thread-level parallelism has become a part of mainstream programming in recent years. Many approaches to parallelization require threads executing in parallel to also synchronize occassionally (i.e., coordinate concurrent accesses to shared state). Transactional Memory (TM) is a programming abstraction that provides the concept of database transactions in the context of programming languages such as C/C++. This allows programmers to only declare which pieces of a program synchronize without requiring them to actually implement synchronization and tune its performance, which in turn makes TM typically easier to use than other abstractions such as locks. I have investigated and implemented the building blocks that are required for a high-performance, practical, and realistic TM. They host several novel algorithms and optimizations for TM implementations, both for current hardware and future hardware extensions for TM, and are being used in or have influenced commercial TM implementations such as the TM support in GCC. info:eu-repo/classification/ddc/004 ddc:004
72	Autonomic Thread Parallelism and Mapping Control for Software Transactional Memory / Contrôle autonomique du parallélisme et du placement de threads pour les mémoires transactionnelles logicielles Zhou, Naweiluo 19 October 2016 (has links) L’exécution de programmes paralléles demande à établir un compromis entre le temps de calcul (nombre de threads) et le temps de synchronisation. Ce compromis dépend principalement du nombre de threads actifs. Un haut degré de parallélisme (beaucoup de threads) permet généralement de diminuer le temps de calcul, mais peut aussi avoir pour conséquence d’augmenter les surcoûts de synchronisation entre threads. De plus, le placement des threads sur les cœurs peut impacter les performances du programme, car le temps pour accéder aux données en mémoire peut varier d’un cœur à l’autre en raison de la contention sur la la hiérarchie mémoire. Ainsi, la performance d’un programme peut être améliorée en adaptant le nombre de threads actifs et en plaçant correctement les threads sur les cœurs de calcul. Cependant, il n’existe pas de règle universelle permettant de décider a priori du niveau de parallélisme optimal et du placement de threads d’un programme, en particulier pour un programme avec les changemets de comportement dynamique. D’ailleurs, un paramétrage hors ligne est moins précis. Cette thèse présente un travail sur la gestion dynamique du parallélisme et du placement de threads. Cette thèse s’attaque au problème de gestion de threads utilisant de la mémoire transactionnelle logicielle (Software Transactional Memory, STM). La mémoire transactionnelle logicielle constitue une technique prometteuse pour traiter le problème de synchronisation en évitant les verrous.Le concept de calcul autonomique offre aux programmeurs un cadre de méthodeset techniques pour construire des systèmes auto-adaptatifs ayant un comportementmaîtrisé. L’idée clé est d’implémenter des boucles de rétroaction afin de concevoir des contrôleurs sûrs, efficaces et prédictibles, permettant d’observer et d’ajuster de manière dynamique les systèmes contrôlés, tout en minimisant le surcoût d’une telle méthode. La thèse propose de concevoir des boucles de rétroaction afin d’automatiser le gestion de threads à l’exécution avec comme objectif la réduction du temps d’exécution des programmes. / Parallel programs need to manage the trade-off between the time spent in synchronisation and computation. The trade-off is significantly affected by the number of active threads. High parallelism may decrease computing time while increase synchronisation cost. Furthermore, thread placement on different cores may impact on program performance, as the data access time can vary from one core to another due to intricacies of its underlying memory architecture. Therefore, the performance of a program can be improved by adjusting its parallelism degree and the mapping of its threads to physical cores. Alas, there is no universal rule to decide them for a program from an offline view, especially for a program with online behaviour variation. Moreover, offline tuning is less precise. This thesis presents work on dynamical management of parallelism and thread placement. It addresses multithread issues via Software Transactional Memory (STM). STM has emerged as a promising technique, which bypasses locks, to tackle synchronisation through transactions. Autonomic computing offers designers a framework of methods and techniques to build autonomic systems with well-mastered behaviours. Its key idea is to implement feedback control loops to design safe, efficient and predictable controllers, which enable monitoring and adjusting controlled systems dynamically while keeping overhead low. This dissertation proposes feedback control loops to automate management of threads at runtime and diminish program execution time. Systèmes autonomiques Mémoire transactionnelle Contrlôle par rétroaction Synchronisation Adaptation du parallèlisme Affinité des threads Autonomic Transactional memory Feedback control Synchronisation Parallelism adaptation Thread affinity 004
73	Soporte arquitectónico a la sincronización imparcial de lectores y escritores en computadores paralelos Vallejo Gutiérrez, Enrique 10 June 2010 (has links) La evolución tecnológica en el diseño de microprocesadores ha conducido a sistemas paralelos con múltiples hilos de ejecución. Estos sistemas son más difíciles de programar y presentan overheads mayores que los sistemas uniprocesadores tradicionales, que pueden limitar su rendimiento y escalabilidad: sincronización, coherencia, consistencia y otros mecanismos requeridos para garantizar una ejecución correcta. La programación paralela tradicional se basa en primitivas de sincronización como barreras y locks de lectura/escritura, con alta tendencia a fallos de programación. La Memoria Transaccional (TM) oculta estos problemas de sincronización al programador; sin embargo, múltiples sistemas TM aún se basan en locks, y se beneficiarían de una implementación eficiente de los mismos.Esta tesis presenta nuevas técnicas hardware para acelerar la ejecución de estos programas paralelos. Proponemos un sistema TM híbrido basado en locks de lectura/escritura, que minimiza los overheads del software cuando la aceleración hardware está presente. Desarrollamos un mecanismo para garantizar fairness entre transacciones hardware y software. Introducimos un mecanismo distribuido de aceleración de locks de lectura/escritura, llamado Lock Control Unit. Finalmente, proponemos una organización de multiprocesadores basadas en Kilo-Instruction Processors que garantiza Consistencia Secuencial y permite especulación en secciones críticas. / Technological evolution in microprocessor design has led to parallel systems with multiple execution threads. These systems are more difficult to program and present higher performance overheads than the traditional uniprocessor systems, what may limit their performance and scalability: synchronization, coherence, consistency and other mechanisms required to guarantee a correct execution. Traditional parallel programming is based on synchronization primitives such as barriers, critical sections and reader/writer locks, highly prone to programming errors. Transactional Memory (TM) removes the synchronization problems from the programmer. However, many TM systems still rely on reader/writer locks, and would get benefited from an efficient implementation.This thesis presents new hardware techniques to accelerate the execution of such parallel programs. We propose a Hybrid TM system based on reader/writer locks, which minimizes the software overheads when acceleration hardware is present, still allowing for correct software-only execution. We propose a mechanism to guarantee fairness between hardware and software transactions is provided. We introduce a low-cost distributed mechanism named the Lock Control Unit to handle fine-grain reader-writer locks. Finally, we propose an organization of a mutiprocessor based on Kilo-Instruction Processors, which guarantees Sequential Consistency while allowing for speculation in critical sections. Implicit Transactions Kilo-Instruction Processors Locks Lock Control Unit Transactional Memory Parallel Computing Reader/writer synchronization 004 62 621.3
74	Operating system transactions Porter, Donald E. 26 January 2011 (has links) Applications must be able to synchronize accesses to operating system (OS) resources in order to ensure correctness in the face of concurrency and system failures. This thesis proposes system transactions, with which the programmer specifies atomic updates to heterogeneous system resources and the OS guarantees atomicity, consistency, isolation, and durability (ACID). This thesis provides a model for system transactions as a concurrency control mechanism. System transactions efficiently and cleanly solve long-standing concurrency problems that are difficult to address with other techniques. For example, malicious users can exploit race conditions between distinct system calls in privileged applications, gaining administrative access to a system. Programmers can eliminate these vulnerabilities by eliminating these race conditions with system transactions. Similarly, failed software installations can leave a system unusable. System transactions can roll back an unsuccessful software installation without disturbing concurrent, independent updates to the file system. This thesis describes the design and implementation of TxOS, a variant of Linux 2.6.22 that implements system transactions. The thesis contributes new implementation techniques that yield fast, serializable transactions with strong isolation and fairness between system transactions and non-transactional activity. Using system transactions, programmers can build applications with better performance or stronger correctness guarantees from simpler code. For instance, wrapping an installation of OpenSSH in a system transaction guarantees that a failed installation will be rolled back completely. These atomicity properties are provided by the OS, requiring no modification to the installer itself and adding only 10% performance overhead. The prototype implementation of system transactions also minimizes non-transactional overheads. For instance, a non-transactional compilation of Linux incurs negligible (less than 2%) overhead on TxOS. Finally, this thesis describes a new lock-free linked list algorithm, called OLF, for optimistic, lock-free lists. OLF addresses key limitations of prior algorithms, which sacrifice functionality for performance. Prior lock-free list algorithms can safely insert or delete a single item, but cannot atomically compose multiple operations (e.g., atomically move an item between two lists). OLF provides both arbitrary composition of list operations as well as performance scalability close to previous lock-free list designs. OLF also removes previous requirements for dynamic memory allocation and garbage collection of list nodes, making it suitable for low-level system software, such as the Linux kernel. We replace lists in the Linux kernel's directory cache with OLF lists, which currently requires a coarse-grained lock to ensure invariants across multiple lists. OLF lists in the Linux kernel improve performance of a filesystem metadata microbenchmark by 3x over unmodified Linux at 8 CPUs. The TxOS prototype demonstrates that a mature OS running on commodity hardware can provide system transactions at a reasonable performance cost. As a practical OS abstraction for application developers, system transactions facilitate writing correct application code in the presence of concurrency and system failures. The OLF algorithm demonstrates that application developers can have both the functionality of locks and the performance scalability of a lock-free linked list. / text Transactions Operating systems Concurrency TxOS Race conditions Transactional memory Lock-free data structures System transactions Linked list algorithms Lock-free algorithm Lock-free lists
75	Types for Correct Concurrent API Usage Beckman, Nels E. 01 December 2010 (has links) This thesis represents an attempt to improve the state of the art in our ability tounderstand and check object protocols, with a particular emphasis on concurrent pro-grams. Object protocols are the patterns of use imposed on clients of APIs in object-oriented programs. We show through an empirical study of open-source object-oriented programs that object protocols are quite common. We then present “Sync-or-Swim,” a methodology and suite of accompanying tools for checking at compile-time that object protocols are used and implemented correctly. This methodology isbased upon the existing access permissions method of alias control, which is hereextended to be sound in the face of shared-memory concurrency. The analysis isformalized as a type system for an object-oriented calculus, and then proven to befree from false-negatives using a proof of type safety. The type system is extendedwith parametric polymorphism, or “generics,” in order to increase its ability to checkcommonly occurring patterns. An implementation of the approach, a static analysisfor programs written in the Java programming language, is presented. This imple-mentation was used to perform a series of case studies whose goal was to evaluatethe ease of use, expressiveness and ability to verify commonly occurring patterns.These case studies are presented. Next, an approach and an associated tool for in-ferring access permission annotations is presented. This inference tool can reducethe burden of using our protocol-checking approach by automatically inferring therequired typing annotations. This inference is built upon a system of probabilisticconstraints, which allows the easy encoding of heuristics. Finally, an optimization ofsoftware transactional memory runtimes is presented. This optimization is enabledby the typing annotations required to use the concurrent protocol checker and canremove some of the overhead typically associated with transactional memory sys-tems. As a result of the work presented in this thesis, it is possible to guarantee theabsence of certain API usage errors even in concurrent programs, and to do so witha low burden on programmers. By adhering to such an approach, programmers canproduce more reliable software. API object protocol typestate concurrency multi-threading object-oriented type theory type system static analysis specification verification Java empirical polymor- phism inference probabilistic transactional memory STM optimization
76	Um modelo de memória transacional para arquiteturas heterogêneas baseado em software Cache / A transactional memory model for heterogeneous architectures based in Software Cache Goldstein, Felipe Portavales 17 August 2018 (has links) Orientador: Rodolfo Jardim de Azevedo / Dissertação (mestrado) - Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica / Made available in DSpace on 2018-08-17T02:02:14Z (GMT). No. of bitstreams: 1 Goldstein_FelipePortavales_M.pdf: 2303926 bytes, checksum: c44512059a990654552904a0f94d74f2 (MD5) Previous issue date: 2010 / Resumo: A adoção de processadores com múltiplos núcleos pela indústria, levou à necessidade de novas técnicas para facilitar a programação de software paralelo. A técnica chamada memórias transacionais é uma das mais promissoras. Esta técnica é capaz de executar tarefas concorrentemente de forma otimista, o que permite um bom desempenho. Outra vantagem é que a sua utilização é muito mais simples comparada com a técnica clássica de exclusão mútua. Neste trabalho é proposto o primeiro modelo de memória transacional para arquiteturas híbridas, neste caso a arquitetura alvo é o processador Cell BE. O processador Cell BE é especialmente complexo por causa das dificuldades que a arquitetura deste processador impõe ao programador quando se necessita acessar a memória global compartilhada. O modelo proposto age como uma camada entre o programa e a memória principal, permitindo um acesso transparente aos dados, garantindo coerência e realizando o controle de concorrência de forma automática. O modelo proposto utiliza Software Cache combinado com a memória transacional para facilitar o acesso à memória externa a partir dos SPEs. Ele foi implementado e testado utilizando 8 aplicativos benchmark diferentes, mostrando sua viabilidade para casos de uso reais. Foi feita uma análise detalhada de cada parte da arquitetura proposta com relação ao impacto no desempenho geral do sistema. Este modelo foi capaz de obter um desempenho até duas vezes superior à implementação utilizando um mutex global. As vantagens da utilização se concentram principalmente na facilidade de uso, garantias de coerência e por evitar alguns tipos de bugs que seriam comuns em uma implementação com mutex, como por exemplo dead-locks. Este trabalho obteve o prêmio de melhor artigo no SBAC-PAD 2008 / Abstract: The adoption of multi-core processors by the industry has pushed towards the development of new techniques to simplify programming parallel software. The technique called transactional memories is one of the most promising. This technique is able to execute multiple tasks concurrently in an optimistic way to achieve a better performance. Another advantage is that the usage of this technique is simpler than the classic mutual exclusion. This work proposes the first transactional memory model for hybrid architectures, in this case the target architecture is the Cell BE processor. The Cell BE is specially complex because of the dificulties when acessing the main shared memory from one of the SPEs. The proposed model acts as a layer between the program running and the main shared memory, allowing transparent access to the data, guaranteeing coherency and automatic concurrency control. The proposed model uses a Software Cache combined with a transactional memory to facilitate the acess to the main memory from the SPEs. This model was implemented and tested using 8 benchmark applications, showing its feasability in real use cases. A detailed analysis of its internal parts has been made to show the impact of each part in the overal system performance. The model was able to achieve a performance up to two times better than a similar implementation using a global mutex. The advantages of this model rely on its usability, coherency guaranty and because it is able to avoid concurrency programming bugs such as dead-lock, which are common in a mutex implementation. This work won the best paper award at SBAC-PAD 2008 / Mestrado / Arquitetura de Computadores / Mestre em Ciência da Computação Memória cache Memória hierárquica (Computação) Processadores multicore Arquitetura de computador Transactional memory Cache memory Hierarchical memory (Computer science) Multicore processors Computer architecture
77	LUTS : a Light-Weight User-Level Transaction Scheduler / LUTS : a Light-Weight User-Level Transaction Scheduler Nicácio, Daniel Henricus de Knegt Dutra, 1984- 22 August 2018 (has links) Orientador: Guido Costa Souza de Araújo / Tese (doutorado) - Universidade Estadual de Campinas, Instituto de Computação / Made available in DSpace on 2018-08-22T08:27:32Z (GMT). No. of bitstreams: 1 Nicacio_DanielHenricusdeKnegtDutra_D.pdf: 2579331 bytes, checksum: b8e15a6f91203b98455f39d63d63a634 (MD5) Previous issue date: 2012 / Resumo: Sistemas de Memória Transacional em Software (MTS) têm sido usados como uma abordagem para melhorar o desempenho ao permitir a execução concorrente de blocos atômicos. Porém, em cenários com alta contenção, sistemas baseados em MTS podem diminuir o desempenho consideravelmente, já que a taxa de conflitos aumenta. Políticas de gerenciamento de contenção têm sido usadas como uma forma de selecionar qual transação abortar quando um conflito ocorre. No geral, gerenciadores de contenção não são capazes de evitar conflitos, tendo em vista que eles apenas selecionam qual transação abortar e o momento em que ela deve reiniciar. Como gerenciadores de contenção agem somente após a detecção de um conflito, é difícil aumentar a taxa de transações finalizadas com sucesso. Abordagens mais pró-ativas foram propostas, focando na previsão de quando uma transação deve abortar e atrasando o início de sua execução. Contudo, as técnicas pró-ativas continuam sendo limitadas, já que elas não substituem a transação fadada a abortar por outra transação com melhores probabilidades de sucesso, ou quando o fazem, dependem do sistema operacional para essa tarefa, tendo pouco ou nenhum controle de qual transação será a substituta. Esta tese apresenta o LUTS, Lightweight User-Level Transaction Scheduler, um escalonador de transação de baixo custo em nível de usuário. Diferente de outras técnicas, LUTS provê maneiras de selecionar outra transação a ser executada em paralelo, melhorando o desempenho do sistema. Nós discutimos o projeto do LUTS e propomos uma heurística dinâmica, com o objetivo de evitar conflitos, que foi construída utilizando os métodos disponibilizados pelo LUTS. Resultados experimentais, conduzidos com os conjuntos de aplicações STAMP e STMBench7, e executando nas bibliotecas TinySTM e SwissTM, mostram como nossa heurística para evitar conflitos pode melhorar efetivamente o desempenho de sistema de MTS em aplicações com alta contenção / Abstract: Software Transaction Memory (STM) systems have been used as an approach to improve performance, by allowing the concurrent execution of atomic blocks. However, under high-contention workloads, STM-based systems can considerably degrade performance, as transaction conflict rate increases. Contention management policies have been used as a way to select which transaction to abort when a conflict occurs. In general, contention managers are not capable of avoiding conflicts, as they can only select which transaction to abort and the moment it should restart. Since contention manager's act only after a conflict is detected, it becomes harder to effectively increase transaction throughput. More proactive approaches have emerged, aiming at predicting when a transaction is likely to abort, postponing its execution. Nevertheless, most of the proposed proactive techniques are limited, as they do not replace the doomed transaction by another or, when they do, they rely on the operating system for that, having little or no control on which transaction to run. This article proposes LUTS, a Lightweight User-Level Transaction Scheduler. Unlike other techniques, LUTS provides the means for selecting another transaction to run in parallel, thus improving system throughput. We discuss LUTS design and propose a dynamic conflict-avoidance heuristic built around its scheduling capabilities. Experimental results, conducted with the STAMP and STMBench7 benchmark suites, running on TinySTM and SwissTM, show how our conflict-avoidance heuristic can effectively improve STM performance on high contention applications / Doutorado / Ciência da Computação / Doutor em Ciência da Computação Memória transacional Programação paralela (Computação) Arquitetura de computador Transactional memory Parallel programming (Computer science) Computer architecture
78	Scheduling and serialization techniques for transactional memories / Técnicas de escalonamento e serialização para memórias transacionais Pereira, Marcio Machado, 1959- 03 February 2015 (has links) Orientadores: Guido Costa Souza de Araújo, José Nelson Amaral / Tese (doutorado) - Universidade Estadual de Campinas, Instituto de Computação / Made available in DSpace on 2018-08-27T10:12:59Z (GMT). No. of bitstreams: 1 Pereira_MarcioMachado_D.pdf: 2922376 bytes, checksum: 9775914667eadf354d7e256fb2835859 (MD5) Previous issue date: 2015 / Resumo: Nos últimos anos, Memórias Transacionais (Transactional Memories ¿ TMs) têm-se mostrado um modelo de programação paralela que combina, de forma eficaz, a melhoria de desempenho com a facilidade de programação. Além disso, a recente introdução de extensões para suporte a TM por grandes fabricantes de microprocessadores, também parece endossá-la como um modelo de programação para aplicações paralelas. Uma das questões centrais na concepção de sistemas de TM em Software (STM) é identificar mecanismos ou heurísticas que possam minimizar a contenção decorrente dos conflitos entre transações. Apesar de já terem sido propostos vários mecanismos para reduzir a contenção, essas técnicas têm um alcance limitado, uma vez que o conflito é evitado por interrupção ou serialização da execução da transação, impactando consideravelmente o desempenho do programa. Este trabalho explora uma abordagem complementar para melhorar o desempenho de STM através da utilização de escalonadores. Um escalonador de TM é um componente de software que decide quando uma determinada transação deve ser executada ou não. Sua eficácia é muito sensível às métricas usadas para prever o comportamento das transações, especialmente em cenários de alta contenção. Este trabalho propõe um novo escalonador, Dynamic Transaction Scheduler ¿ DTS, para selecionar a próxima transação a ser executada. DTS é baseada em uma política de "recompensa pelo sucesso" e utiliza uma métrica que mede com melhor precisão o trabalho realizado por uma transação. Memórias Transacionais em Hardware (HTMs) são mecanismos interessante para implementar TM porque integram o suporte a transações no nível da arquitetura. Por outro lado, aplicações que usam HTM podem ter o seu desempenho dificultado pela falta de escalabilidade e transbordamento da cache de dados. Este trabalho apresenta um extenso estudo de desempenho de aplicações que usam HTM na arquitetura Haswell da Intel. Ele avalia os pontos fortes e fracos desta nova arquitetura, realizando uma exploração das várias características das aplicações de TM. Este estudo detalhado revela as restrições impostas pela nova arquitetura e introduz uma política de serialização simples, porém eficaz, para garantir o progresso das transações, além de proporcionar melhor desempenho / Abstract: In the last few years, Transactional Memories (TMs) have been shown to be a parallel programming model that can effectively combine performance improvement with ease of programming. Moreover, the recent introduction of (H)TM-based ISA extensions, by major microprocessor manufacturers, also seems to endorse TM as a programming model for today¿s parallel applications. One of the central issues in designing Software TM (STM) systems is to identify mechanisms or heuristics that can minimize contention arising from conflicting transactions. Although a number of mechanisms have been proposed to tackle contention, such techniques have a limited scope, because conflict is avoided by either interrupting or serializing transaction execution, thus considerably impacting performance. This work explores a complementary approach to boost the performance of STM through the use of schedulers. A TM scheduler is a software component that decides when a particular transaction should be executed. Their effectiveness is very sensitive to the accuracy of the metrics used to predict transaction behaviour, particularly in high-contention scenarios. This work proposes a new Dynamic Transaction Scheduler ¿ DTS to select a transaction to execute next, based on a new policy that rewards success and an improved metric that measures the amount of effective work performed by a transaction. Hardware TMs (HTM) are an interesting mechanism to implement TM as they integrate the support for transactions at the lowest, most efficient, architectural level. On the other hand, for some applications, HTMs can have their performance hindered by the lack of scalability and by limitations in cache store capacity. This work presents an extensive performance study of the implementation of HTM in the Haswell generation of Intel x86 core processors. It evaluates the strengths and weaknesses of this new architecture by exploring several dimensions in the space of TM application characteristics. This detailed performance study provides insights on the constraints imposed by the Intel¿s Transaction Synchronization Extension (Intel¿s TSX) and introduces a simple, but efficient, serialization policy for guaranteeing forward progress on top of the best-effort Intel¿s HTM which was critical to achieving performance / Doutorado / Ciência da Computação / Doutor em Ciência da Computação Memória transacional Programação paralela (Computação) Processamento paralelo (Computadores) Programação (Computadores) Transactional memory Parallel programming (Computer science) Computer programming
79	Tools and Techniques for Efficient Transactions Poudel, Pavan 07 September 2021 (has links) No description available. Computer Science transactional memory persistent memory versioning concurrency conflicts switching distributed system scheduling execution time communication cost data-flow control-flow predefined order dynamic algorithm
80	Exploiting Speculative and Asymmetric Execution on Multicore Architectures Wamhoff, Jons-Tobias 21 November 2014 (has links) The design of microprocessors is undergoing radical changes that affect the performance and reliability of hardware and will have a high impact on software development. Future systems will depend on a deep collaboration between software and hardware to cope with the current and predicted system design challenges. Instead of higher frequencies, the number of processor cores per chip is growing. Eventually, processors will be composed of cores that run at different speeds or support specialized features to accelerate critical portions of an application. Performance improvements of software will only result from increasing parallelism and introducing asymmetric processing. At the same time, substantial enhancements in the energy efficiency of hardware are required to make use of the increasing transistor density. Unfortunately, the downscaling of transistor size and power will degrade the reliability of the hardware, which must be compensated by software. In this thesis, we present new algorithms and tools that exploit speculative and asymmetric execution to address the performance and reliability challenges of multicore architectures. Our solutions facilitate both the assimilation of software to the changing hardware properties as well as the adjustment of hardware to the software it executes. We use speculation based on transactional memory to improve the synchronization of multi-threaded applications. We show that shared memory synchronization must not only be scalable to large numbers of cores but also robust such that it can guarantee progress in the presence of hardware faults. Therefore, we streamline transactional memory for a better throughput and add fault tolerance mechanisms with a reduced overhead by speculating optimistically on an error-free execution. If hardware faults are present, they can manifest either in a single event upset or crashes and misbehavior of threads. We address the former by applying transactions to checkpoint and replicate the state such that threads can correct and continue their execution. The latter is tackled by extending the synchronization such that it can tolerate crashes and misbehavior of other threads. We improve the efficiency of transactional memory by enabling a lightweight thread that always wins conflicts and significantly reduces the overheads. Further performance gains are possible by exploiting the asymmetric properties of applications. We introduce an asymmetric instrumentation of transactional code paths to enable applications to adapt to the underlying hardware. With explicit frequency control of individual cores, we show how applications can expose their possibly asymmetric computing demand and dynamically adjust the hardware to make a more efficient usage of the available resources. info:eu-repo/classification/ddc/004 ddc:004

Search results