Global ETD Search

1	Advances Towards Data-Race-Free Cache Coherence Through Data Classification Davari, Mahdad January 2017 Providing a consistent view of the shared memory based on precise and well-defined semantics (the memory consistency model) has been an enabling factor in the widespread acceptance and commercial success of shared-memory architectures. Moreover, cache coherence protocols have been employed by the hardware to relieve programmers of the burden of dealing with the memory inconsistency that emerges in the presence of private caches. The principle behind all such cache coherence protocols is to guarantee that consistent values are read from the private caches at all times. In its most stringent form, a cache coherence protocol eagerly enforces two invariants before each data modification: i) no other core has a copy of the data in its private caches, and ii) all other cores know where to obtain the consistent data should they need it later. Nevertheless, by partly transferring the responsibility for maintaining those invariants to the programmers, commercial multicores have adopted weaker memory consistency models, namely Total Store Order (TSO), in order to optimize performance for the more common cases. Moreover, memory models with more relaxed invariants have been proposed based on the observation that more and more software is written in compliance with the Data-Race-Free (DRF) semantics. The semantics of DRF software can be leveraged by the hardware to infer when data in the private caches might be inconsistent. As a result, the hardware ignores the inconsistent data and retrieves the consistent data from the shared memory. DRF semantics therefore removes from the hardware the burden of eagerly enforcing the strong consistency invariants before each data modification. Instead, consistency is guaranteed only when needed. 
This results in manifold optimizations, such as reducing the energy consumption and improving the performance and scalability. The efficiency of detecting and discarding the inconsistent data is an important factor affecting the efficiency of such coherence protocols. For instance, discarding the consistent data does not affect the correctness, but results in performance loss and increased energy consumption. In this thesis we show how data classification can be leveraged as an effective tool to simplify the cache coherence based on the DRF semantics. In particular, we introduce simple but efficient hardware-based private/shared data classification techniques that can be used to efficiently detect the inconsistent data, thus enabling low-overhead and scalable cache coherence solutions based on the DRF semantics. Shared Memory Architectures Multicore Memory Hierarchy Cache Coherence Data Classification Computer Systems Datorsystem
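The idea of using a private/shared classification to decide which cached data must be discarded can be illustrated with a small software model. The sketch below is not the thesis's hardware mechanism: it assumes a first-touch classifier at page granularity, and the names `Classifier`, `self_invalidate`, and `PAGE_SIZE` are hypothetical. It only shows why classification helps: at a synchronization point, data classified as private can stay in the cache, and only shared data needs to be dropped.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (illustrative)


class Classifier:
    """Toy first-touch private/shared classifier at page granularity."""

    def __init__(self):
        self.owner = {}      # page -> first core that touched it
        self.shared = set()  # pages touched by more than one core

    def access(self, core, addr):
        """Record an access and return the page's current classification."""
        page = addr // PAGE_SIZE
        first = self.owner.setdefault(page, core)
        if first != core:
            # A second core touched the page: it becomes shared for good.
            self.shared.add(page)
        return "shared" if page in self.shared else "private"


def self_invalidate(cached_addrs, classifier):
    """At a synchronization point, keep private lines and drop shared ones."""
    return {a for a in cached_addrs
            if (a // PAGE_SIZE) not in classifier.shared}
```

In this model, discarding a private line would still be correct but wasteful, which matches the abstract's point that discarding consistent data costs performance and energy without affecting correctness.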
2	Data Replication in Hybrid Memory Database Systems Zarubin, Mikhail 15 March 2022 The recent advances in hardware technologies - i.e. highly scalable multi-core NUMA architectures and non-volatile random-access memory (NVRAM) - lead to significant changes in the architecture of in-memory database systems. The novel memory type allows persistent writes while featuring DRAM-like characteristics - byte addressability, high bandwidth, and low access latencies. It is likely to complement or replace the block-based secondary storage (e.g., HDDs or SSDs) for storing the primary data of the DBMS. Therefore, the next generation of highly performant, scalable database systems will rely on single-level hybrid memory (e.g., composed exclusively of DRAM and NVRAM) NUMA architectures and is expected to keep the primary data persistent solely in NVRAM, while query processing could be executed on both mediums. Unfortunately, NVRAM has certain drawbacks, such as a lower write endurance, lower bandwidth, higher latencies, and - most importantly - an increased error-proneness compared to DRAM. Thus, efficient minimal-overhead data protection mechanisms have to be deployed in the underlying architectures to avoid primary data losses. This thesis provides an analytical overview of such envisioned hybrid memory database systems, gives a survey of reliability techniques that are generally deployed in computing systems, and identifies their strengths and weaknesses when used in hybrid memory databases. As a result, this work proposes effective adoption and optimization primitives for software-managed data replication as the most applicable resilience approach. In particular, research focus is given to reducing the runtime and space (and, therefore, NVRAM wear-out) overheads of replication, while preserving strong resilience guarantees and instant recovery opportunities. 
Subsequently, this thesis proposes a rich set of techniques that leverage data replication for query processing needs to achieve high performance, allocation flexibility and effective hardware utilization in modern commodity scale-up systems. info:eu-repo/classification/ddc/004 ddc:004
3	Power-Aware Compilation Techniques For Embedded Systems Shyam, K 07 1900 The demand for devices like Personal Digital Assistants (PDAs), laptops, and smart mobile phones is at an all-time high. As the demand for these devices increases, so does the push to provide sophisticated functionality in them. However, energy consumption has become a major constraint in providing increased functionality for these devices. A majority of the applications meant for these devices are rich in multimedia content. In this thesis, we propose two approaches for compiler-directed energy reduction, one targeting the memory subsystem and another the processor. The first technique is a compiler-directed optimization that reduces the energy consumption of the memory subsystem for an off-chip partitioned memory architecture with multiple memory banks and various low-power operating modes for each of these banks. We propose an efficient layout of the data segment to reduce the number of simultaneously active memory banks, so that the inactive memory banks can be put into low-power modes to reduce the energy. We model this problem as a graph partitioning problem and use well-known heuristics to solve it. We also propose a simple Integer Linear Programming (ILP) formulation for the above problem. Performance results indicate that our approach achieves an energy reduction of 20% compared to the base scheme, and a reduction of 8%-10% over a previously suggested method. Also, our results are close to the optimal results obtained using the ILP method. The second approach proposed in this thesis reduces the dynamic energy consumed by the processor using dynamic voltage and frequency scaling. Earlier works on dynamic voltage scaling focused mainly on performing voltage scaling when the CPU is waiting for the memory subsystem, or concentrated chiefly on loop nests and/or subroutine calls having a sufficient number of dynamic instructions. 
We concentrate on coarser program regions and, for the first time, use program phase behavior for performing dynamic voltage scaling. We relate the Dynamic Voltage Scaling Problem to the Multiple Choice Knapsack Problem, and use well-known heuristics to solve it efficiently. Also, we develop a simple Integer Linear Programming (ILP) formulation for this problem. Experimental evaluation on a set of media applications reveals that our heuristic method obtains a 35-40% reduction in energy consumption on average, with negligible performance degradation. Further, the energy consumed by our heuristic solution is within 1% of the optimal solution obtained by the ILP approach. Electric Power Control Compilers Embedded Systems Computer Memory Architecture Voltages Dynamic Voltage Scaling Memory Architectures Integer Linear Programming (ILP) Computer Science
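The mapping from dynamic voltage scaling to the Multiple Choice Knapsack Problem can be made concrete with a small sketch. It is not the heuristic used in the thesis: it assumes each program region offers a few voltage/frequency settings, each with an integer execution time and an energy cost, and it solves the problem exactly with a dynamic program over elapsed time. The function name `min_energy_schedule` and the input layout are hypothetical.

```python
def min_energy_schedule(regions, deadline):
    """Pick one (time, energy) option per region to minimize total energy.

    regions: list of option lists; each option is (time, energy) with an
             integer time (assumed units, e.g. milliseconds).
    deadline: maximum allowed total time.
    Returns (total_energy, chosen_option_indices), or None if no choice of
    options meets the deadline.
    """
    # best[t] = (energy, picks): cheapest way to spend exactly t time units
    best = {0: (0.0, [])}
    for options in regions:
        nxt = {}
        for t, (energy, picks) in best.items():
            for i, (dt, de) in enumerate(options):
                nt = t + dt
                if nt > deadline:
                    continue  # this setting would miss the deadline
                cand = (energy + de, picks + [i])
                if nt not in nxt or cand[0] < nxt[nt][0]:
                    nxt[nt] = cand
        best = nxt
        if not best:
            return None  # no feasible setting for this region
    return min(best.values(), key=lambda entry: entry[0])
```

For example, with two regions whose options are `[(2, 10.0), (4, 4.0)]` and `[(3, 9.0), (5, 5.0)]` and a deadline of 8, the slow setting is chosen for the first region and the fast one for the second, since running both slowly would take 9 time units.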
