About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations. Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
51

Arithmetic Computations and Memory Management Using a Binary Tree Encoding of Natural Numbers

Haraburda, David 12 1900 (has links)
Two applications of a binary tree data type based on a simple pairing function (a bijection between natural numbers and pairs of natural numbers) are explored. First, the tree is used to encode natural numbers, and algorithms that perform basic arithmetic computations are presented along with formal proofs of their correctness. Second, using this "canonical" representation as a base type, algorithms for encoding and decoding additional isomorphic data types of other mathematical constructs (sets, sequences, etc.) are also developed. An experimental application to a memory management system is constructed and explored using these isomorphic types. A practical analysis of this system's runtime complexity and space savings is provided, along with a proof-of-concept framework for both applications of the binary tree type, in the Java programming language.
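
The abstract does not reproduce the pairing function or the encoding itself. The sketch below (in C++; the thesis's proof-of-concept framework is in Java) shows one way such an encoding can work, assuming the classic Cantor pairing function, which the thesis does not necessarily use: 0 is a leaf, and any n > 0 unpairs into the two subtrees.

```cpp
#include <cstdint>
#include <cmath>
#include <iostream>
#include <memory>

// A natural number viewed as a binary tree: 0 is a leaf, and any n > 0
// unpairs (via a bijection between naturals and pairs of naturals) into
// the two subtrees. The Cantor pairing function is assumed here.
struct Tree {
    std::unique_ptr<Tree> left, right;   // both null => leaf (encodes 0)
};

// Cantor pairing: pair(x, y) = (x + y)(x + y + 1)/2 + y
static uint64_t pair(uint64_t x, uint64_t y) {
    uint64_t s = x + y;
    return s * (s + 1) / 2 + y;
}

// Inverse of the Cantor pairing, with correction loops to absorb
// floating-point error in the initial estimate of the diagonal index.
static void unpair(uint64_t z, uint64_t& x, uint64_t& y) {
    uint64_t w = static_cast<uint64_t>((std::sqrt(8.0 * z + 1.0) - 1.0) / 2.0);
    while ((w + 1) * (w + 2) / 2 <= z) ++w;
    while (w * (w + 1) / 2 > z) --w;
    y = z - w * (w + 1) / 2;
    x = w - y;
}

// Encode a natural number as a tree: 0 -> leaf, n -> node(unpair(n - 1)).
std::unique_ptr<Tree> encode(uint64_t n) {
    auto t = std::make_unique<Tree>();
    if (n == 0) return t;                 // leaf
    uint64_t a, b;
    unpair(n - 1, a, b);
    t->left = encode(a);
    t->right = encode(b);
    return t;
}

// Decode is the exact inverse, so encode/decode form a bijection.
uint64_t decode(const Tree& t) {
    if (!t.left && !t.right) return 0;
    return pair(decode(*t.left), decode(*t.right)) + 1;
}

int main() {
    for (uint64_t n = 0; n < 20; ++n)
        if (decode(*encode(n)) != n) std::cout << "mismatch at " << n << '\n';
    std::cout << "round-trip ok\n";
}
```

Because pair and unpair are mutual inverses and both components of unpair(n - 1) are strictly smaller than n, encode terminates and decode(encode(n)) = n for every natural number.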
52

DirectX12: A Resource Heap Type Copying Time Analysis

Törnblom, Simon, Hellman, Pontus January 2020 (has links)
Background: The DirectX 12 API gives programmers more control over the GPU's memory management, including the ability to allocate resources on different types of memory heaps, but there is a lack of research on how these heap types affect copying performance. Objectives: The aim of this thesis is to benchmark the copying performance of the different heap types in DirectX 12 as the data size increases. The heaps are tested with the three types of command queue that can be used to submit commands to the GPU. Method: To answer our research question, a DirectX 12 prototype was implemented and used to copy increasing amounts of data between different heap types. The copy operations were also combined with the three types of command queue to see whether these have any impact on performance. The tests ran on three different Nvidia graphics cards in the same computer setup, both to validate our results and to spot any potential differences. Results: The results of this study show that there is a difference in copying speed when data is copied between resources allocated on different heap types. From fastest to slowest: Default to Default, Upload to Default / Default to Readback, and Upload to Readback. The type of command queue did not have an impact on performance, with the exception of copies from Default to Default on an RTX 2080. All of the tests showed that copying time scaled linearly with data size. Conclusion: This study shows the importance of allocating resources on the most suitable heap, as there is a difference in copying time between them. In contrast, the choice of command queue was less important, as it had no impact on performance in the majority of the tests. The results also show that copying time scales linearly with data size.
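
The thesis does not include its benchmark source. As a rough illustration of the operations being timed, the sketch below allocates one buffer on an upload heap and one on a default heap and records a GPU-side copy on a copy-type command list. It assumes an already created ID3D12Device, omits fencing and timing, and the helper name MakeBuffer is ours, not part of the D3D12 API.

```cpp
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

// Create a committed buffer resource on the given heap type. Upload heaps
// must start in GENERIC_READ, readback heaps in COPY_DEST; a default-heap
// buffer can start as a copy target.
static ComPtr<ID3D12Resource> MakeBuffer(ID3D12Device* device,
                                         D3D12_HEAP_TYPE heapType,
                                         UINT64 size,
                                         D3D12_RESOURCE_STATES initialState) {
    D3D12_HEAP_PROPERTIES heapProps = {};
    heapProps.Type = heapType;

    D3D12_RESOURCE_DESC desc = {};
    desc.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER;
    desc.Width = size;
    desc.Height = 1;
    desc.DepthOrArraySize = 1;
    desc.MipLevels = 1;
    desc.Format = DXGI_FORMAT_UNKNOWN;
    desc.SampleDesc.Count = 1;
    desc.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;

    ComPtr<ID3D12Resource> buffer;
    device->CreateCommittedResource(&heapProps, D3D12_HEAP_FLAG_NONE, &desc,
                                    initialState, nullptr, IID_PPV_ARGS(&buffer));
    return buffer;
}

// Record an Upload -> Default copy on a copy-type command list.
void RecordUploadToDefaultCopy(ID3D12Device* device, UINT64 size) {
    auto src = MakeBuffer(device, D3D12_HEAP_TYPE_UPLOAD, size,
                          D3D12_RESOURCE_STATE_GENERIC_READ);
    auto dst = MakeBuffer(device, D3D12_HEAP_TYPE_DEFAULT, size,
                          D3D12_RESOURCE_STATE_COPY_DEST);

    ComPtr<ID3D12CommandAllocator> alloc;
    ComPtr<ID3D12GraphicsCommandList> list;
    device->CreateCommandAllocator(D3D12_COMMAND_LIST_TYPE_COPY, IID_PPV_ARGS(&alloc));
    device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_COPY, alloc.Get(),
                              nullptr, IID_PPV_ARGS(&list));

    list->CopyResource(dst.Get(), src.Get());   // the operation being benchmarked
    list->Close();
    // Execute the closed list on a queue of matching type and wait on a
    // fence before reading the timer.
}
```

Executing the list on a D3D12_COMMAND_LIST_TYPE_COPY, DIRECT, or COMPUTE queue, and varying the heap types of src and dst, corresponds to the dimensions the thesis compares.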
53

Accelerating machine learning with memory management and persistent memory

Wood, Andrew 07 February 2024 (has links)
Machine Learning (ML) is expensive: it requires machines that possess large compute capabilities, high memory and storage capacities, and exceedingly large amounts of time for models to train. When training a model, ML programs in general follow a standard blueprint: the model makes at least one “pass” through the dataset, where a single “pass” consists of supplying the data in chunks to the model. During each pass through the dataset, the model updates its internal state with the goal of improving predictive performance on the data that it sees. One of the primary reasons ML programs are expensive is an artifact of this blueprint. In fact, especially in the age of “big data,” the requirements to run ML programs are so large that they must be distributed across a cluster of machines, as a single machine alone cannot meet all of the requirements. However, the landscape of hardware is changing. With the introduction of a new memory technology called Persistent Memory (PM), a single machine can now satisfy program requirements that, in the past, it could not. Persistent Memory is unique in that it can play multiple roles within the memory hierarchy: the collection of memory devices a machine is equipped with. By utilizing the unique properties of PM, ML can be further optimized for program performance metrics such as runtime, crash consistency, etc. In this dissertation, I accelerate each stage of the ML training blueprint. First, I show that for algorithms that normally cannot execute on a single machine, the entire blueprint can be executed using PM. I then evaluate PM against other devices in a series of micro-benchmarks that emulate common memory operations used by ML programs. Through these micro-benchmarks, I provide guidelines for researchers to consider when optimizing their programs. Finally, I use these guidelines to accelerate the checkpointing operation: the process of recording the state of the ML program to persistent storage in a crash-consistent manner.
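
The dissertation's PM-based checkpointing mechanism is not described in the abstract, so the sketch below only illustrates the general crash-consistent checkpoint pattern it alludes to: write the new state to a temporary file, flush it, then atomically rename it over the previous checkpoint so a crash leaves either the old or the new snapshot intact. The function name and parameter layout are illustrative.

```cpp
#include <cstdio>
#include <string>
#include <vector>
#include <fcntl.h>
#include <unistd.h>

// Write a model snapshot so that a crash at any point leaves either the old
// or the new checkpoint on disk, never a torn one.
bool checkpoint(const std::string& path, const std::vector<float>& params) {
    std::string tmp = path + ".tmp";
    int fd = ::open(tmp.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return false;

    const char* buf = reinterpret_cast<const char*>(params.data());
    size_t remaining = params.size() * sizeof(float);
    while (remaining > 0) {
        ssize_t n = ::write(fd, buf, remaining);
        if (n < 0) { ::close(fd); return false; }
        buf += n;
        remaining -= static_cast<size_t>(n);
    }

    // Force the data to stable storage before the rename makes it visible.
    if (::fsync(fd) != 0) { ::close(fd); return false; }
    ::close(fd);

    // rename() is atomic on POSIX file systems: readers see old or new file.
    return std::rename(tmp.c_str(), path.c_str()) == 0;
}
```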
54

Managing Memory for Power, Performance, and Thermal Efficiency

Tolentino, Matthew Edward 08 April 2009 (has links)
Extraordinary improvements in computing performance, density, and capacity have driven rapid increases in system energy consumption, motivating the need for energy-efficient performance. Harnessing the collective computational capacity of thousands of these systems can consume megawatts of electrical power, even though many systems may be underutilized for extended periods of time. At scale, powering and cooling unused or lightly loaded systems can waste millions of dollars annually. To combat this inefficiency, we propose system software, control systems, and architectural techniques to improve the energy efficiency of high-capacity memory systems while preserving performance. We introduce and discuss several new application-transparent memory management algorithms, as well as a formal analytical model, rooted in classical control theory, of a power-state control system we developed to scale memory capacity proportionally with application demand. We present a prototype implementation of this control-theoretic runtime system and evaluate it on sequential memory systems. We also discuss why the traditional performance-motivated approach of maximizing interleaving within memory systems is problematic and should be revisited in terms of power and thermal efficiency. We then present power-aware control techniques for improving the energy efficiency of symmetrically interleaved memory systems. Given the limitations of traditional interleaved memory configurations, we propose and evaluate unorthodox, asymmetrically interleaved memory configurations. We show that, when coupled with our control techniques, significant energy savings can be achieved without sacrificing application performance or memory bandwidth. / Ph. D.
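
The dissertation's controller is not specified in the abstract; purely as a toy illustration of demand-proportional capacity scaling, the sketch below picks how many memory banks to keep online from current demand and moves toward that target gradually rather than all at once. All names and constants are made up for the example.

```cpp
#include <algorithm>
#include <cstdio>

// Toy proportional controller in the spirit of capacity scaling: choose how
// many memory banks to keep online so that demand plus a safety margin fits,
// adjusting by a fraction of the error each period to avoid thrashing.
struct BankController {
    double bankSizeMiB;     // capacity of one bank
    int    totalBanks;      // banks physically present
    int    onlineBanks;     // banks currently powered
    double gain;            // proportional gain (0 < gain <= 1)

    int step(double demandMiB, double marginMiB) {
        // Target: smallest bank count covering demand + margin.
        int target = static_cast<int>((demandMiB + marginMiB + bankSizeMiB - 1) / bankSizeMiB);
        target = std::clamp(target, 1, totalBanks);

        // Move part of the way toward the target each control period.
        double error = target - onlineBanks;
        int delta = static_cast<int>(gain * error + (error > 0 ? 0.999 : -0.999));
        onlineBanks = std::clamp(onlineBanks + delta, 1, totalBanks);
        return onlineBanks;   // caller would power banks up or down to match
    }
};

int main() {
    BankController c{1024.0, 8, 8, 0.5};
    for (double demand : {6000.0, 3000.0, 1200.0, 5000.0})
        std::printf("demand %.0f MiB -> %d banks online\n", demand, c.step(demand, 512.0));
}
```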
55

An Evaluation of the Linux Virtual Memory Manager to Determine Suitability for Runtime Variation of Memory

Muthukumaraswamy Sivakumar, Vijay 01 June 2007 (has links)
Systems that support virtual memory virtualize the available physical memory such that the applications running on them operate under the assumption that these systems have a larger amount of memory available than is actually present. The memory managers of these systems manage the virtual and the physical address spaces and are responsible for converting the virtual addresses used by the applications to the physical addresses used by the hardware. The memory managers assume that the amount of physical memory is constant and does not change during their period of operation. Some operating scenarios, however, such as power conservation mechanisms and virtual machine monitors, require the ability to vary the physical memory available at runtime, thereby invalidating the assumptions made by these memory managers. In this work we evaluate the suitability of the Linux Memory Manager, which assumes that the available physical memory is constant, for the purpose of varying the memory at runtime. We have implemented an infrastructure over the Linux 2.6.11 kernel that enables the user to vary the physical memory available to the system. The available physical memory is logically divided into banks, and each bank can be turned on or off independently of the others, using the new system calls we have added to the kernel. Apart from adding support for the new system calls, other changes had to be made to the Linux memory manager to support the runtime variation of memory. To evaluate this suitability, we have performed experiments with varying memory sizes on both the modified and the unmodified kernels. We have observed that the design of the existing memory manager is not well suited to support the runtime variation of memory; we provide suggestions to make it better suited for such purposes. Even though applications running on systems that support virtual memory do not use the physical memory directly and are not aware of the physical addresses they use, the amount of physical memory available for use affects the performance of the applications. The results of our experiments have helped us study the influence that the amount of physical memory available for use has on the performance of various types of applications. These results can be used in scenarios requiring the ability to vary the memory at runtime, so that it is done with the least degradation in application performance. / Master of Science
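
The thesis's own system calls are not named in the abstract and are not reproduced here. As a comparable mechanism, mainline Linux exposes memory hotplug through sysfs: each memory block has a state file that accepts "online" or "offline" (root privileges and a kernel built with memory hotplug/hot-remove support are required). The sketch below toggles a block through that interface.

```cpp
#include <fstream>
#include <iostream>
#include <string>

// Write "online" or "offline" to a memory block's sysfs state file.
bool setMemoryBlockState(int block, const std::string& state) {
    std::string path = "/sys/devices/system/memory/memory" +
                       std::to_string(block) + "/state";
    std::ofstream f(path);
    if (!f) return false;
    f << state;          // "online" or "offline"
    f.flush();
    return f.good();
}

int main(int argc, char** argv) {
    if (argc != 3) {
        std::cerr << "usage: memctl <block-number> <online|offline>\n";
        return 1;
    }
    if (!setMemoryBlockState(std::stoi(argv[1]), argv[2])) {
        std::cerr << "failed to change state of memory block " << argv[1] << '\n';
        return 1;
    }
    return 0;
}
```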
56

A component-based approach to proving the correctness of the Schorr-Waite algorithm

Singh, Amrinder 23 August 2007 (has links)
This thesis presents a component-based approach to proving the correctness of programs involving pointers. Unlike previous work, our component-based approach supports modular reasoning, which is essential to the scalability of systems. Specifically, we specify the behavior of a graph-marking algorithm known as the Schorr-Waite algorithm, implement it using a component that captures the behavior and performance benefits of pointers, and prove that the implementation is correct with respect to the specification. We use the Resolve language in our example, which is an integrated programming and specification language that supports modular reasoning. The behavior of the algorithm is fully specified using custom definitions, pre- and post-conditions, and a complex loop invariant. Additional operations for the Resolve pointer component are introduced that preserve the accessibility of a system. These operations are used in the implementation of the algorithm. They simplify the proof of correctness and make the code shorter. / Master of Science
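
The verified Resolve implementation is not reproduced in the abstract. As an informal, unverified illustration of what the Schorr-Waite algorithm does, here is the classic formulation for nodes with two pointer fields, written in C++: the traversal stack is encoded by temporarily reversing child pointers, and a per-node flag records whether the left subtree has already been finished.

```cpp
#include <iostream>

// Schorr-Waite graph marking for nodes with two pointer fields. Instead of
// an explicit stack, the path back to the root is stored by temporarily
// reversing child pointers; every pointer is restored before the algorithm
// returns, and marked nodes are never entered twice, so cycles and sharing
// are handled.
struct Node {
    Node* left = nullptr;
    Node* right = nullptr;
    bool marked = false;
    bool flag = false;     // true once the left subtree has been finished
};

void schorrWaiteMark(Node* root) {
    Node* cur = root;       // node we are about to visit
    Node* prev = nullptr;   // parent chain, encoded in reversed pointers
    while (true) {
        if (cur != nullptr && !cur->marked) {
            // Advance: mark cur and descend left, pointing cur->left back
            // at prev so we can find our way up later.
            cur->marked = true;
            cur->flag = false;
            Node* next = cur->left;
            cur->left = prev;
            prev = cur;
            cur = next;
        } else if (prev == nullptr) {
            break;                        // back above the root: done
        } else if (!prev->flag) {
            // Left subtree of prev finished: restore its left pointer,
            // stash the back pointer in right, and descend right.
            Node* back = prev->left;
            Node* rightChild = prev->right;
            prev->left = cur;
            prev->right = back;
            prev->flag = true;
            cur = rightChild;
        } else {
            // Both subtrees finished: restore the right pointer and retreat.
            Node* back = prev->right;
            prev->right = cur;
            cur = prev;
            prev = back;
        }
    }
}

int main() {
    Node a, b, c;
    a.left = &b; a.right = &c;
    c.left = &a;                    // a cycle, to show sharing is handled
    schorrWaiteMark(&a);
    std::cout << a.marked << b.marked << c.marked << '\n';   // prints 111
}
```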
58

Statisk detektering av minneshanteringsfel i C/C++ / Static detection of memory management errors in C/C++

Javanbakhti, Reza, Pesola, Jimmy January 2006 (has links)
This bachelor's project is our own work, but it is based on ideas from an assignment from the Saab Aerotech company. The goal was to investigate whether there is a need for a tool that can statically detect dynamic memory management errors, such as memory leaks, in applications written in C/C++. Since the problem of memory management errors in the C/C++ languages has been known for a long time, we decided to investigate this and the existing solutions. We found two approaches among existing tools: static and dynamic detection. Most of these tools solve the problem by dynamically detecting memory leaks and other deficiencies such as buffer overflows. However, one of these tools used static detection of these deficiencies by scanning the source code of the applications. Since all the existing solutions have some kind of inefficiency, we have investigated the possibility of developing a more efficient tool. We concluded that this is possible, but it will take a lot of time and effort to implement a complete tool that statically detects memory management errors. Our prototype statically detects dynamic memory management problems in the source code. We have used the tools Flex and Bison to develop our prototype of a static detection tool. The prototype analyzes source code written in the programming languages C and C++ and is capable of detecting memory leaks, invalid deallocations of memory, dangling pointers, and reading from and writing to invalid memory areas. Currently, due to lack of time, we have not implemented any support for classes and objects in the prototype.
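
To make the error classes concrete, here are small, deliberately buggy C++ snippets of the kind such a tool would flag; none of this code comes from the thesis or its prototype.

```cpp
#include <cstring>

// Each function shows one category of error a static detector targets.
// The functions are never meant to be called; they exist to be analyzed.

void memoryLeak() {
    int* p = new int[100];
    p[0] = 42;
    // p goes out of scope without delete[]: memory leak
}

void invalidDeallocation() {
    int* p = new int[10];
    delete p;            // allocated with new[] but freed with delete
}

int danglingPointer() {
    int* p = new int(7);
    delete p;
    return *p;           // read through a pointer to freed memory
}

void invalidWrite() {
    char buf[8];
    std::strcpy(buf, "this string is too long");   // writes past the end of buf
}
```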
59

Systems and applications for persistent memory

Dulloor, Subramanya R. 07 January 2016 (has links)
Performance-hungry data center applications demand increasingly higher performance from their storage in addition to larger capacity memory at lower cost. While the existing storage technologies (e.g., HDD and flash-based SSD) are limited in their performance, the most prevalent memory technology (DRAM) is unable to address the capacity and cost requirements of these applications. Emerging byte-addressable, non-volatile memory technologies (such as PCM and RRAM) offer performance within an order of magnitude of DRAM, prompting their inclusion in the processor memory subsystem. Such load/store accessible non-volatile or persistent memory (referred to as NVM or PM) introduces an interesting new tier that bridges the performance gap between DRAM and storage, serving the role of fast storage or slower memory. However, PM has several implications on system design, both hardware and software: (i) the hardware caching mechanisms, while necessary for acceptable performance, complicate the ordering and durability of stores to PM, (ii) the high performance of PM (compared to NAND) and the fact that it is byte-addressable necessitate rethinking of the system software to manage PM and the interfaces to expose PM to the applications, and (iii) the future memory-based applications that will likely employ systems coupling PM with DRAM (for cost and capacity reasons) must be extremely conscious of the performance characteristics of PM and the challenges of using fast vs. slow memory in ways that best meet their performance demands. The key contribution of our research is a set of technologies that addresses these challenges in a bottom-up fashion. Since the real hardware is not yet available, we first implement a hardware emulator that can faithfully emulate the relative performance characteristics of DRAM and PM in a system with separate DRAM and emulated PM regions. We use this emulator to perform all of our evaluations. Next we explore system software support to enable low-overhead PM access by new and legacy applications. Towards this end, we implement PMFS, an optimized light-weight POSIX file system that exploits PM's byte-addressability to avoid the overheads of block-oriented storage and enable direct PM access by applications (with memory-mapped I/O). To provide strong consistency guarantees, PMFS requires only a simple hardware primitive that provides software enforceable guarantees of durability and ordering of stores to PM. We demonstrate that PMFS achieves significant (up to an order of magnitude) gains over traditional file systems (such as ext4) on a RAMDISK-like PM block device. Finally, we address the problem of designing memory-based applications for systems with both DRAM and PM by extending our system software to manage both tiers. We demonstrate for several representative large in-memory applications that it is possible to use a small amount of fast DRAM and large amounts of slower PM without a proportional impact on an application's performance, provided the placement of data structures is done carefully. To simplify application programming, we implement a set of libraries and automatic tools (called X-Mem) that enables programmers to achieve optimal data placement with minimal effort on their part. Using X-Mem, we demonstrate the potentially large benefits of application-driven memory tiering across a range of applications.
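
PMFS internals and its hardware ordering primitive are not shown in the abstract; the following sketch only illustrates the direct-access pattern it describes: memory-map a file from a PM-aware file system, store to it with ordinary instructions, then flush the affected cache lines and fence (x86 intrinsics) so the stores become durable. The file path handling and the 64-byte cache-line size are assumptions of the example.

```cpp
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <immintrin.h>   // _mm_clflush, _mm_sfence (x86)

// Map a file into the address space, write to it with ordinary stores, and
// flush plus fence so the update reaches persistent media. On a DAX-capable
// PM file system the mapping goes straight to PM with no page cache.
bool persistRecord(const char* path, const char* data, size_t len) {
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0) return false;
    if (ftruncate(fd, static_cast<off_t>(len)) != 0) { close(fd); return false; }

    void* base = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (base == MAP_FAILED) return false;

    std::memcpy(base, data, len);                     // ordinary stores

    // Flush every cache line covering the update, then fence so the flushes
    // are ordered before anything that depends on the data being durable.
    auto* p = static_cast<char*>(base);
    for (size_t off = 0; off < len; off += 64)
        _mm_clflush(p + off);
    _mm_sfence();

    munmap(base, len);
    return true;
}
```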
60

An adaptive software transactional memory support for multi-core programming

Chan, Kinson., 陳傑信. January 2009 (has links)
published_or_final_version / Computer Science / Master / Master of Philosophy
