  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
481

Algorithms and data structures for cache-efficient computation: theory and experimental evaluation

Chowdhury, Rezaul Alam 28 August 2008
Not available / text
482

Algorithms for distributed caching and aggregation

Tiwari, Mitul 29 August 2008
Not available
483

DRAM-aware prefetching and cache management

Lee, Chang Joo, 1975- 11 February 2011
Main memory system performance is crucial for high performance microprocessors. Even though the peak bandwidth of main memory systems has increased through improvements in the microarchitecture of Dynamic Random Access Memory (DRAM) chips, conventional on-chip memory systems of microprocessors do not fully take advantage of it. This results in underutilization of the DRAM system, in other words, many idle cycles on the DRAM data bus. The main reason for this is that conventional on-chip memory system designs do not fully take into account important DRAM characteristics. Therefore, the high bandwidth of DRAM-based main memory systems cannot be realized and exploited by the processor. This dissertation identifies three major performance-related characteristics that can significantly affect DRAM performance and makes a case for DRAM characteristic-aware on-chip memory system design. We show that on-chip memory resource management policies (such as prefetching, buffer, and cache policies) that are aware of these DRAM characteristics can significantly enhance entire system performance. The key idea of the proposed mechanisms is to send out to the DRAM system useful memory requests that can be serviced with low latency or in parallel with other requests rather than requests that are serviced with high latency or serially. Our evaluations demonstrate that each of the proposed DRAM-aware mechanisms significantly improves performance by increasing DRAM utilization for useful data. We also show that when employed together, the performance benefit of each mechanism is achieved additively: they work synergistically and significantly improve the overall system performance of both single-core and Chip MultiProcessor (CMP) systems. / text
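The key idea stated in this abstract can be illustrated with a toy request picker. This is only a sketch under loud assumptions, not the dissertation's mechanism: it assumes a single DRAM bank with one open row, treats a request to the open row as the low-latency case, and uses a hypothetical `useful` flag to stand in for whatever usefulness prediction the on-chip memory system has. The names `Request`, `pick_next`, and `drain` are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    row: int      # DRAM row targeted by the request (hypothetical single-bank model)
    useful: bool  # assumed prediction that the data will actually be used


def pick_next(queue, open_row):
    """Return the index of the request to send to DRAM next.

    Assumed preference order, for illustration only:
    1. a useful request that hits the currently open row (low latency),
    2. otherwise the oldest useful request,
    3. otherwise the oldest request.
    """
    for i, req in enumerate(queue):
        if req.useful and req.row == open_row:
            return i
    for i, req in enumerate(queue):
        if req.useful:
            return i
    return 0


def drain(queue, open_row):
    """Service every request; return the service order and the number of row hits."""
    pending = list(queue)
    order = []
    row_hits = 0
    while pending:
        req = pending.pop(pick_next(pending, open_row))
        if req.row == open_row:
            row_hits += 1
        open_row = req.row  # servicing a request leaves its row open
        order.append(req)
    return order, row_hits


queue = [Request(1, True), Request(2, True), Request(1, True), Request(2, False)]
order, row_hits = drain(queue, open_row=2)
print([r.row for r in order], row_hits)
```

With row 2 initially open, strict arrival order (rows 1, 2, 1, 2) would hit the open row zero times in this model, while the picker above reorders the queue to obtain two row hits and defers the request predicted not to be useful.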
484

Hardware transactional memory : a systems perspective

Rossbach, Christopher John 22 March 2011
The increasing ubiquity of chip multiprocessor machines has made the need for accessible approaches to parallel programming all the more urgent. The current state of the art, based on threads and locks, requires the programmer to use mutual exclusion to protect shared resources, enforce invariants, and maintain consistency constraints. Despite decades of research effort, this approach remains fraught with difficulty. Lock-based programming is complex and error-prone, largely due to well-known problems such as deadlock, priority inversion, and poor composability. Tradeoffs between performance and complexity for locks remain unattractive. Coarse-grain locking is simple but introduces artificial sharing, needless serialization, and yields poor performance. Fine-grain locking can address these issues, but at a significant cost in complexity and maintainability. Transactional memory has emerged as a technology with the potential to address this need for better parallel programming tools. Transactions provide the abstraction of isolated, atomic execution of critical sections. The programmer specifies regions of code which access shared data, and the system is responsible for executing that code in a way that is isolated and atomic. The programmer need not reason about locks and threads. Transactional memory removes many of the pitfalls of locking: transactions are livelock- and deadlock-free and may be composed freely. Hardware transactional memory, which is the focus of this thesis, provides an efficient implementation of the TM abstraction. This thesis explores several key aspects of supporting hardware transactional memory (HTM): operating systems support and integration, architectural, design, and implementation considerations, and programmer-transparent techniques to improve HTM performance in the presence of contention. 
Using and supporting HTM in an OS requires innovation in both the OS and the architecture, but enables practical approaches and solutions to some long-standing OS problems. Innovations in transactional cache coherence protocols enable HTM support in the presence of multi-level cache hierarchies, rich HTM semantics such as suspend/resume and multiple transactions per thread context, and can provide the building blocks for support of flexible contention management policies without the need to trap to software handlers. We demonstrate a programmer-transparent hardware technique for using dependences between transactions to commit conflicting transactions, and suggest techniques to allow conflicting transactions to avoid performance-sapping restarts without using heuristics such as backoff. Both mechanisms yield better performance for workloads that have significant write-sharing. Finally, in the context of the MetaTM HTM model, this thesis contributes a high-fidelity cross-design comparison of representative proposals from the literature: the result is a comprehensive exploration of the HTM design space that compares the behavior of models of MetaTM (70, 75), LogTM (58, 94), and Sun's Rock (22). / text
485

A memory profiler for 3D graphics application using binary instrumentation

Deo, Mrinal 25 July 2011
This report describes the architecture and implementation of a memory profiler for 3D graphics applications. Profiling covers the parts of the program that run on the graphics processor and are responsible for rendering the image. The shaders are parsed, and every memory instruction is instrumented with additional instructions for profiling. The results are then transferred from video memory to CPU memory. Profiling is done for a frame and completes in less than three minutes. The report also describes various analyses that can be performed on the results obtained from the profiler, and it discusses the design of an analytical cache model that can identify, among all the buffers used by an application, the memory buffers that are suitable candidates for caching. The profiler can report results for reads and writes separately and can handle all formats of texture access instructions as well as predicated instructions. / text
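As a rough illustration of how profiled addresses might be used to rank buffers as caching candidates, the sketch below applies a deliberately simple model. It is an assumption, not the analytical cache model from the report: it pretends the cache is unbounded, so only the first touch of each cache line misses. The function names, the 64-byte line size, and the sample traces are all hypothetical.

```python
def estimated_hit_ratio(addresses, line_size=64):
    """Estimate the hit ratio of one buffer's access trace.

    Assumed model: an unbounded cache, so each distinct cache line
    costs exactly one (compulsory) miss and every later access hits.
    """
    if not addresses:
        return 0.0
    distinct_lines = {addr // line_size for addr in addresses}
    return 1.0 - len(distinct_lines) / len(addresses)


def rank_buffers(traces, line_size=64):
    """Return buffer names sorted from best to worst caching candidate."""
    return sorted(
        traces,
        key=lambda name: estimated_hit_ratio(traces[name], line_size),
        reverse=True,
    )


# Hypothetical per-buffer byte offsets, as a profiler might report them.
traces = {
    "texture": [0, 4, 8, 12, 0, 4, 8, 12],  # heavy reuse within one line
    "vertex": [0, 64, 128, 192],            # streaming, each line touched once
}
print(rank_buffers(traces))
```

A buffer with heavy line reuse ranks above a streaming buffer. A real model would also need to account for cache capacity and associativity, which this sketch ignores.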
486

Cache controller based on AHB bus

Γερακάρης, Δημήτρης 16 May 2014
This thesis is an attempt to build a cache controller based on the AHB bus. It was developed mostly in the VLSI Laboratory of the Department of Computer Engineering and Informatics, with the prospect of integrating it into a larger existing system based on ARM's open-source CPU, the Cortex M0. It was tested successfully on an FPGA in the laboratory but has not yet been used under "real conditions". The ultimate goal is for it to be used in the laboratory to accelerate applications that need external memory, i.e. more memory than the FPGA's embedded memory. Although it has not been tested in any other system, it was built to follow the AHB standard, so in principle it should integrate without problems into any system compatible with the bus. The rationale behind the implementation is that certain variables should be relatively easy to change, so that the controller can be adapted to each user's needs. The specifications are given below, although the controller may be redesigned, outside the scope of this thesis and within 2014, to become fully modular. / Cache controller compatible with AHB bus in SystemVerilog.
487

Development of a technique for increasing the reliability of first-level caches based on the spatial locality of memory blocks

Μαυρόπουλος, Μιχαήλ 16 May 2014
This thesis addresses the problem of the reliability of first-level data and instruction caches. The high integration density and high operating frequency of modern integrated circuits have led to significant reliability problems, caused either by manufacturing or by the aging of the circuits. The work first assesses the performance degradation of first-level caches when permanent faults appear, for different integration technologies. It then presents a new technique for countering the effect of faults, based on predicting the spatial locality of the memory blocks brought into the first-level caches. The technique is evaluated using an architecture-level simulator. / In this thesis we work on the problem of the reliability of first-level data and instruction cache memories. Technology scaling is affecting the reliability of ICs due to increases in static and dynamic variations as well as wear-out failures. First, we estimate the impact of permanent faults on first-level caches. We then propose a methodology to mitigate the negative impact of defective bits. Our methodology is based on predicting the spatial locality of the blocks entering the cache. Finally, using cycle-accurate simulation, we show that our approach offers significant benefits in cache performance.
488

Energy Management for Virtual Machines

Ye, Lei January 2013
Current computing infrastructures use virtualization to increase resource utilization by deploying multiple virtual machines on the same hardware. Virtualization is particularly attractive for data centers, cloud computing, and hosting services; in these environments computer systems are typically configured with fast processors, large physical memory, and huge storage capable of supporting the concurrent execution of virtual machines. Consequently, this high demand for resources translates directly into higher energy consumption and monetary costs, and managing the energy consumption of virtual machines is becoming increasingly critical. However, virtual machines make energy management more challenging because a layer of virtualization separates the hardware from the guest operating system executing inside a virtual machine. This dissertation addresses the challenge of designing energy-efficient storage, memory, and buffer cache for virtual machines by exploring innovative mechanisms as well as existing approaches. We analyze the architecture of Xen, an open-source virtual machine platform, and address energy management in each subsystem. For the storage system, we study the I/O behavior of virtual machine systems. We address the isolation between the virtual machine monitor and the virtual machines, and increase the burstiness of disk accesses to improve energy efficiency. In addition, we propose transparent energy management for main memory that works with any type of guest operating system running inside a virtual machine. Furthermore, we design a dedicated mechanism for the buffer cache, based on the fact that data-intensive applications rely heavily on a large buffer cache that occupies a majority of physical memory. We also propose a novel hybrid mechanism that is able to improve energy efficiency for any memory access. All the mechanisms achieve significant energy savings while lowering the impact on performance for virtual machines.
489

Design and development of a shared distributed memory system for a chip multiprocessor (CMP)

Αδαμίδης, Ανδρέας 09 February 2009
The purpose of this master's thesis is the design and development of a shared distributed memory system as part of the SiScape multiprocessor architecture. Because of the particularities of this architecture, its memory system, and in particular the second-level cache that makes its operation possible, had to be designed and developed from scratch in order to meet the architecture's requirements. The design of the second-level cache was described in the VHDL hardware description language.
490

Enabling scalable online user interaction management through data warehousing of interaction histories / by Helen Thomas

Thomas, Helen 12 1900
No description available.
