Global ETD Search

1	POSEIDON: The First Safe and Scalable Persistent Memory Allocator
Demeri, Anthony K. 20 May 2020

With the advent of byte-addressable Non-Volatile Main Memory (NVMM), the need for a safe, scalable and high-performing memory allocator is inevitable. A slow memory allocator can bottleneck the entire application stack, while an insecure memory allocator can leave underlying systems and applications inconsistent after program bugs or system failure. Unlike a DRAM-based memory allocator, an NVMM allocator must guarantee the safety of its heap metadata against both internal and external errors. An effective NVMM memory allocator should be 1) safe, 2) scalable and 3) high performing. Unfortunately, none of the existing persistent memory allocators achieves all three requisites. Critically, the de facto NVMM allocator, Intel's Persistent Memory Development Kit (PMDK), is vulnerable to silent data corruption and persistent memory leaks as a result of a simple heap overflow. We closely investigate the existing de facto NVMM memory allocators, especially PMDK, to study their vulnerability to metadata corruption and the reasons for their poor performance and scalability. We propose Poseidon, which is safe, fast and scalable. Poseidon provides a user application with per-CPU sub-heaps for scalability, while managing the heap metadata in a segregated fashion and efficiently protecting the metadata using a scalable hardware-based protection scheme, Intel's Memory Protection Keys (MPK). We evaluate Poseidon with a wide array of microbenchmarks and real-world benchmarks. Poseidon outperforms the state-of-the-art allocators by a significant margin, showing improved scalability and performance, while also guaranteeing metadata safety.

Master of Science

Since the dawn of time, civilization has revolved around effective communication. From smoke signals to telegraphs and beyond, communication has continued to be a cornerstone of successful societies. Today, communication and collaboration occur daily on a global scale, such that even sub-second units of time are critical to successful societal operation. Naturally, many forms of modern communication revolve around our digital systems, such as personal computers, email servers, and social networking database applications. There is, thus, a never-ending surge of digital system development, constantly striving toward increased performance. For some time, increasing a system's dynamic random-access memory, or DRAM, has been able to provide performance gains; unfortunately, due to thermal and power constraints, such an increase is no longer feasible. Additionally, loss of power on a DRAM system causes bothersome loss of data, since the memory storage is volatile to power loss. Now, we are at the advent of an entirely new physical memory system, termed non-volatile main memory (NVMM), which has near identical performance properties to DRAM, but is operational in much larger quantities, thus allowing increased overall system speed. Alas, such a system also imposes additional requirements upon software developers: for NVMM, all memory updates are permanent, so a failed update can cause persistent memory corruption. Regrettably, the existing software standard, led by Intel's Persistent Memory Development Kit (PMDK), is insecure (allowing permanent memory corruption with ease), low performing, and a bottleneck for multicore systems. Here, we present a secure, high-performing solution, termed Poseidon, which harnesses the full potential of NVMM.

Keywords: persistent memory; memory allocator; scalability; security
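
The per-CPU sub-heap and segregated-metadata idea described in the Poseidon abstract can be illustrated with a small Python model. This is only a sketch under stated assumptions, not Poseidon's implementation: the class names `SubHeap` and `PerCpuAllocator` are hypothetical, the CPU id is passed in by the caller rather than queried from the system, blocks are never reused, and the MPK-based hardware protection is not modeled at all. It shows only why keeping metadata outside the user data region means an overflowing write cannot reach it.

```python
import threading


class SubHeap:
    """One sub-heap: user data and allocation metadata live in separate stores."""

    def __init__(self, size):
        self.data = bytearray(size)   # user-visible region
        self.meta = {}                # offset -> block length, kept outside `data`
        self.top = 0                  # bump pointer (this sketch never reuses blocks)
        self.lock = threading.Lock()  # contended only by callers mapped to this sub-heap

    def alloc(self, nbytes):
        with self.lock:
            if self.top + nbytes > len(self.data):
                raise MemoryError("sub-heap exhausted")
            offset = self.top
            self.top += nbytes
            self.meta[offset] = nbytes
            return offset

    def write(self, offset, payload):
        # Deliberately not checked against the block length, to model a heap overflow.
        # The write is clipped to the region so the bytearray never changes size.
        end = min(offset + len(payload), len(self.data))
        self.data[offset:end] = payload[:end - offset]


class PerCpuAllocator:
    """Routes each request to the sub-heap that belongs to the calling CPU."""

    def __init__(self, ncpus, subheap_size):
        self.subheaps = [SubHeap(subheap_size) for _ in range(ncpus)]

    def alloc(self, cpu_id, nbytes):
        # Assumption: cpu_id is supplied by the caller in this sketch.
        heap_id = cpu_id % len(self.subheaps)
        return heap_id, self.subheaps[heap_id].alloc(nbytes)


allocator = PerCpuAllocator(ncpus=4, subheap_size=64)
heap_id, block_a = allocator.alloc(cpu_id=1, nbytes=8)
_, block_b = allocator.alloc(cpu_id=1, nbytes=8)

# Overflow block_a by 8 bytes: the neighbouring block's data is clobbered,
# but the metadata store is untouched because it is not part of `data`.
allocator.subheaps[heap_id].write(block_a, b"X" * 16)
print(allocator.subheaps[heap_id].meta)
```

In an allocator that stores block headers inline with user data, the same overflow would overwrite the neighbouring header. Segregation removes that path, and the abstract states that Poseidon additionally protects the metadata with MPK.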
2	Techniques for formal modelling and verification on dynamic memory allocators / Techniques de modélisation et de vérification formelles des allocateurs de mémoire dynamiques
Fang, Bin 10 September 2018

This thesis is a contribution to the formal specification and verification of sequential dynamic memory allocators (SDMA for short), which are key components of operating systems and of some software libraries. SDMA manage the heap part of process memory. Their implementations use both complex data structures and low-level operations. The thesis focuses on SDMA that use list-based data structures to manage the heap blocks available for allocation (free-list SDMA).

The first part of the thesis demonstrates how to obtain formal specifications of free-list SDMA using a refinement-based approach. The thesis defines a hierarchy of models, ranked by the refinement relation, that captures a large variety of techniques and policies employed by real-world SDMA implementations. This hierarchy forms an algorithm theory for free-list SDMA and could be extended with other policies. The formal specifications are written in Event-B and the refinements have been proved using the Rodin platform. The thesis investigates applications of the formal specifications obtained, such as model-based testing, code generation and verification.

The second part of the thesis defines a technique, based on abstract interpretation, for inferring precise invariants of existing SDMA implementations. For this, the thesis defines an abstract domain whose values represent sets of states of the SDMA. The abstract domain is based on a fragment of Separation Logic, called SLMA. This fragment captures properties related to the shape and the content of the data structures used by the SDMA to manage the heap. The abstract domain is defined as a specific product of an abstract domain for heap shapes with an abstract domain for finite arrays of locations. To obtain compact elements of this abstract domain, the thesis proposes a hierarchical organisation of the abstract values: a first level abstracts the list of all chunks, while a second level selects only the chunks available for allocation. The thesis defines transformers of the abstract values that soundly capture the semantics of statements used in SDMA implementations. A prototype implementation of this abstract domain has been used to analyse simple implementations of SDMA.

Keywords: Modular refinement; Abstract interpretation; Dynamic memory allocator
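
To make the two-level view in this abstract concrete (the list of all chunks, and the sub-list of chunks available for allocation), here is a minimal free-list allocator sketch in Python. It is an illustration under stated assumptions, not the thesis's Event-B models or its abstract domain: the class `FreeListHeap`, the first-fit policy, eager coalescing, and the runtime `check_invariants` method are all choices made for this example. The thesis infers such invariants statically, whereas this sketch merely asserts them at run time.

```python
class FreeListHeap:
    """Toy heap: chunks are [start, size, is_free] records kept in address order."""

    def __init__(self, size):
        self.size = size
        self.chunks = [[0, size, True]]   # first level: every chunk in the heap

    def free_list(self):
        # Second level: only the chunks available for allocation.
        return [c for c in self.chunks if c[2]]

    def malloc(self, n):
        if n <= 0:
            return None
        for i, (start, size, is_free) in enumerate(self.chunks):
            if is_free and size >= n:                  # first-fit policy
                self.chunks[i] = [start, n, False]
                if size > n:                           # split off the remainder
                    self.chunks.insert(i + 1, [start + n, size - n, True])
                return start
        return None                                    # no chunk is large enough

    def free(self, addr):
        for c in self.chunks:
            if c[0] == addr and not c[2]:
                c[2] = True
                self._coalesce()
                return
        raise ValueError("invalid free")

    def _coalesce(self):
        # Merge every run of adjacent free chunks into a single chunk.
        merged = [self.chunks[0]]
        for c in self.chunks[1:]:
            if merged[-1][2] and c[2]:
                merged[-1][1] += c[1]
            else:
                merged.append(c)
        self.chunks = merged

    def check_invariants(self):
        # Chunks tile the heap exactly: contiguous, non-empty, no overlap.
        addr = 0
        for start, size, _ in self.chunks:
            assert start == addr and size > 0
            addr += size
        assert addr == self.size
        # Eager coalescing: no two adjacent chunks are both free.
        for a, b in zip(self.chunks, self.chunks[1:]):
            assert not (a[2] and b[2])


heap = FreeListHeap(64)
a = heap.malloc(16)
b = heap.malloc(16)
heap.free(a)
heap.check_invariants()
print(heap.chunks)
print(heap.free_list())
```

The two properties asserted in `check_invariants` (the chunks tile the heap, and no two neighbouring chunks are both free) are examples of the shape and content properties that a verifier for such allocators would need to establish.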
3	Simple, safe, and efficient memory management using linear pointers
Liu, Likai 22 January 2016

Efficient and safe memory management is a hard problem. Garbage collection promises automatic memory management but comes at the cost of an increased memory footprint, reduced parallelism in multi-threaded programs, unpredictable pause times, and intricate tuning parameters that balance the program's workload against its designated memory usage so that an application performs reasonably well. Existing research mitigates these problems to some extent, but programmer error can still cause memory leaks by erroneously keeping memory references when they are no longer needed. We need a methodology for programmers to become resource aware, so that efficient, scalable, predictable and high-performance programs may be written without fear of resource leaks. Linear logic has been recognized as the formalism of choice for resource tracking. It requires explicit introduction and elimination of resources and guarantees that a resource cannot be implicitly shared or abandoned, hence must be linear. Early languages based on linear logic focused on the Curry-Howard correspondence. They began by limiting the expressive power of the language and then reintroduced it by allowing controlled sharing, which is necessary for recursive functions. However, only by deviating from the Curry-Howard correspondence could later developments actually address programming errors in resource usage. The contribution of this dissertation is a simple, safe, and efficient approach that introduces linear resource ownership semantics into C++ (still a widely used language more than 30 years after its inception) through the linear pointer, a smart pointer inspired by linear logic. By implementing various linear data structures and a parallel, multi-threaded memory allocator based on these data structures, this work shows that the linear pointer is practical and efficient in the real world, and that it is possible to build a memory management stack that is entirely leak free. The dissertation offers some closing remarks on the difficulties a formal system would encounter when reasoning about a concurrent linear data algorithm, and on what might be done to solve these problems.

Keywords: Computer science; Concurrency; C++; Data structures; Linear pointer; Memory allocator; Memory management
4	Performance Optimisation of Discrete-Event Simulation Software on Multi-Core Computers / Prestandaoptimering av händelsestyrd simuleringsmjukvara på flerkärniga datorer
Kaeslin, Alain E. January 2016

SIMLOX is commercial discrete-event simulation software developed by Systecon AB for analysing logistic support solution scenarios for complex technical systems. To cope with ever larger problems, SIMLOX's simulation engine was recently enhanced with a parallel execution mechanism in order to take advantage of multi-core processors. However, this extension did not result in the desired reduction in runtime for all simulation scenarios, even though the parallelisation strategy applied had promised linear speedup. Therefore, an in-depth analysis of the limiting scalability bottlenecks became necessary and has been carried out in this project. Through the use of a low-overhead profiler and microarchitecture analysis, the root causes were identified: atomic operations causing a high communication overhead, poor locality leading to translation lookaside buffer (TLB) thrashing, and hot spots that consume significant amounts of CPU time. Subsequently, appropriate optimisations to overcome the limiting factors were implemented: eliminating the expensive operations, handling heap memory more efficiently through the use of a scalable memory allocator, and using data structures that make better use of caches. Experimental evaluation using real-world test cases demonstrated a speedup of at least 6.75x on an eight-core processor. Most cases even achieve a speedup of more than 7.2x. The various optimisations implemented further helped to lower run times for sequential execution by 1.5x or more. It can be concluded that achieving nearly linear speedup on a multi-core processor is possible in practice for discrete-event simulation.

Keywords: cache hierarchy; caches; communication overhead; data structures; discrete-event simulation; heap memory; linear speedup; logistic support; low-overhead profiler; memory allocator; microarchitecture; microarchitecture analysis; multi-core; optimisation; parallel execution; profiler; runtime; scalability; scalability bottlenecks; scalable memory allocator; simulation; translation lookaside buffer; translation lookaside buffer thrashing; TLB; Computer Sciences
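
The claim of nearly linear speedup can be checked with a short calculation of parallel efficiency, defined as speedup divided by core count. The sketch below uses only the figures reported in the abstract (6.75x and 7.2x on eight cores). The function name `parallel_efficiency` is chosen for this example and does not come from the thesis.

```python
def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear speedup achieved (1.0 means perfectly linear)."""
    return speedup / cores


CORES = 8
reported = [("minimum reported speedup", 6.75), ("speedup exceeded in most cases", 7.2)]

for label, speedup in reported:
    efficiency = parallel_efficiency(speedup, CORES)
    print(f"{label}: {speedup}x on {CORES} cores = {efficiency:.1%} of linear")
```

An efficiency close to 1.0 is what "nearly linear" means here: each added core contributes almost a full core's worth of throughput.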
