1 |
Evaluation of caches and cache coherency
Sehat, Kamiar, January 1992
No description available.
|
2 |
Quantum entanglement and classical information
Henderson, L., January 2000
No description available.
|
3 |
DRAM-aware prefetching and cache management
Lee, Chang Joo, 1975-, 11 February 2011
Main memory system performance is crucial for high-performance microprocessors. Even though the peak bandwidth of main memory systems has increased through improvements in the microarchitecture of Dynamic Random Access Memory (DRAM) chips, conventional on-chip memory systems of microprocessors do not fully take advantage of it. The result is underutilization of the DRAM system, that is, many idle cycles on the DRAM data bus. The main reason is that conventional on-chip memory system designs do not fully take important DRAM characteristics into account. Therefore, the high bandwidth of DRAM-based main memory systems cannot be realized and exploited by the processor.
This dissertation identifies three major performance-related characteristics that can significantly affect DRAM performance and makes a case for DRAM-characteristic-aware on-chip memory system design. We show that on-chip memory resource management policies (such as prefetching, buffer, and cache policies) that are aware of these DRAM characteristics can significantly enhance overall system performance. The key idea of the proposed mechanisms is to send the DRAM system useful memory requests that can be serviced with low latency or in parallel with other requests, rather than requests that are serviced with high latency or serially. Our evaluations demonstrate that each of the proposed DRAM-aware mechanisms significantly improves performance by increasing DRAM utilization for useful data. We also show that when employed together, the mechanisms work synergistically: their performance benefits add up and significantly improve the overall system performance of both single-core and Chip Multiprocessor (CMP) systems.
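The abstract's key idea, preferring requests that can be serviced with low latency, can be illustrated with a toy scheduler that favors row-buffer hits (in the spirit of first-ready, first-come-first-serve policies). The latencies, names, and the policy itself are illustrative assumptions on our part, not details taken from the dissertation.

```python
# Toy DRAM request scheduler: prefer requests that hit the currently open
# row (serviced with low latency) over requests that need a new row
# activation. Latency values are assumed, for illustration only.

ROW_HIT_LATENCY = 1   # cycles for a row-buffer hit (assumed)
ROW_MISS_LATENCY = 3  # cycles for a row activation + access (assumed)

def run(requests, open_row, hit_first):
    """Service a stream of requests (row numbers); return total cycles.
    With hit_first=True, requests targeting the open row go first."""
    queue = list(requests)
    total = 0
    while queue:
        idx = 0
        if hit_first:
            for i, row in enumerate(queue):
                if row == open_row:
                    idx = i
                    break
        row = queue.pop(idx)
        total += ROW_HIT_LATENCY if row == open_row else ROW_MISS_LATENCY
        open_row = row
    return total

reqs = [7, 3, 7, 3]
print(run(reqs, 7, hit_first=True))   # 6 cycles: row-7 hits serviced first
print(run(reqs, 7, hit_first=False))  # 10 cycles: rows alternate, three misses
```

Servicing the two row-7 requests back to back avoids one activation, which is exactly the kind of idle-cycle reduction the abstract describes.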
|
4 |
Fair and high performance shared memory resource management
Ebrahimi, Eiman, 31 January 2012
Chip multiprocessors (CMPs) commonly share a large portion of memory
system resources among different cores. Since memory requests from
different threads executing on different cores significantly interfere
with one another in these shared resources, the design of the shared
memory subsystem is crucial for achieving high performance and
fairness.
Inter-thread memory system interference has different implications
based on the type of workload running on a CMP. In multi-programmed
workloads, different applications can experience significantly
different slowdowns. If left uncontrolled, large disparities in
slowdowns result in low system performance and make system software's
priority-based thread scheduling policies ineffective. In a single
multi-threaded application, memory system interference between threads
of the same application can slow each thread down significantly. Most
importantly, the critical path of execution can also be
significantly slowed down, resulting in increased application
execution time.
This dissertation proposes three mechanisms that address different
shortcomings of current shared resource management techniques targeted
at multi-programmed workloads, and one mechanism which speeds up a
single multi-threaded application by managing main-memory related
interference between its different threads.
With multi-programmed workloads, the key idea is that both demand- and
prefetch-caused inter-application interference should be taken into
account in shared resource management techniques across the entire
shared memory system. Our evaluations demonstrate that doing so
significantly improves both system performance and fairness compared
to the state-of-the-art. When executing a single multi-threaded
application on a CMP, the key idea is to take into account the
inter-dependence of threads in memory scheduling decisions. Our
evaluation shows that doing so significantly reduces the execution
time of the multi-threaded application compared to using
state-of-the-art memory schedulers designed for multi-programmed
workloads.
This dissertation concludes that the performance and fairness of CMPs
can be significantly improved by better management of inter-thread
interference in the shared memory resources, both for multi-programmed
workloads and multi-threaded applications.
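The disparity in slowdowns that the abstract warns about is commonly quantified as a per-application slowdown and an unfairness ratio. The formulation below is a standard one, not necessarily the exact metric used in the dissertation.

```python
def slowdown(shared_time, alone_time):
    """How much longer an application runs when sharing the memory system
    than when running alone on the CMP."""
    return shared_time / alone_time

def unfairness(apps):
    """Ratio of the largest to the smallest slowdown across co-running
    applications; 1.0 means perfectly fair interference."""
    s = [slowdown(shared, alone) for shared, alone in apps]
    return max(s) / min(s)

# (shared_time, alone_time) pairs for three co-running applications
apps = [(12.0, 10.0), (30.0, 10.0), (11.0, 10.0)]
print(unfairness(apps))  # ~2.73: the second application suffers far more
```

An uncontrolled memory system can drive this ratio high even when total throughput looks acceptable, which is why the dissertation manages fairness and performance together.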
|
5 |
Performance Prediction of Parallel Programs in a Linux Environment
Farooq, Mohammad Habibur Rahman & Qaisar, January 2010
Context. Today’s parallel systems are widely used for different computational tasks. Developing parallel programs that make maximum use of the computing power of parallel systems is tricky, and efficient tuning of parallel programs is often very hard. Objectives. In this study we present a performance prediction and visualization tool named VPPB for a Linux environment, previously introduced by Broberg et al. [1] for a Solaris 2.x environment. VPPB shows the predicted behavior of a multithreaded program for any number of processors, displayed on two different graphs. The prediction is based on a monitored uniprocessor execution. Methods. An experimental evaluation was carried out to validate the prediction reliability of the developed tool. Results. The validation was conducted using an Intel multiprocessor with 8 processors and application programs from the PARSEC 2.0 benchmark suite. It shows that the speed-up predictions are within ±7% of a real execution. Conclusions. Experimentation with the VPPB tool showed that its predictions are reliable and that the overhead incurred in the application programs is low.
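The core idea of predicting parallel behavior from a uniprocessor run can be sketched with a greedy scheduling model: per-thread CPU times measured on one processor are assigned to the least-loaded of N simulated processors. This is our own simplified sketch; VPPB additionally models synchronization and visualizes the result, which this toy ignores.

```python
import heapq

def predict_runtime(thread_times, n_procs):
    """Greedily assign each thread's measured (uniprocessor) CPU time to
    the least-loaded of n_procs processors; return the predicted makespan.
    Synchronization effects are ignored in this sketch."""
    loads = [0.0] * n_procs
    heapq.heapify(loads)
    for t in sorted(thread_times, reverse=True):  # place longest threads first
        heapq.heappush(loads, heapq.heappop(loads) + t)
    return max(loads)

times = [4.0, 3.0, 3.0, 2.0]           # per-thread CPU time on one processor
serial = sum(times)                     # 12.0 s monitored on a uniprocessor
parallel = predict_runtime(times, 2)    # 6.0 s predicted on 2 processors
print(serial / parallel)                # predicted speed-up: 2.0
```

A real predictor would then compare such estimates against measured multiprocessor runs, as the ±7% validation in this thesis does.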
|
6 |
Reducing DRAM Row Activations with Eager Writeback
Jeon, Myeongjae, 06 September 2012
This thesis describes and evaluates a new approach to optimizing DRAM performance and energy consumption that is based on eagerly writing dirty cache lines to DRAM. Under this approach, dirty cache lines that have not been recently accessed are eagerly written to DRAM when the corresponding row has been activated by an ordinary access, such as a read. This approach enables clustering of reads and writes that target the same row, resulting in a significant reduction in row activations. Specifically, for 29 applications, it reduces the number of DRAM row activations by an average of 38% and a maximum of 81%. The results from a full system simulator show that for the 29 applications, 11 have performance improvements between 10% and 20%, and 9 have improvements in excess of 20%. Furthermore, 10 consume between 10% and 20% less DRAM energy, and 10 have energy consumption reductions in excess of 20%.
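The clustering effect described above can be shown with a toy activation counter: writing dirty lines while their row is already open for a read avoids reopening the row later. Row size, addresses, and the two schedules below are illustrative assumptions, not data from the thesis.

```python
ROW_SIZE = 4  # cache lines per DRAM row (assumed)

def row_of(addr):
    return addr // ROW_SIZE

def activations(schedule):
    """Count row activations for an ordered stream of line addresses:
    a new activation is needed whenever the target row changes."""
    count, open_row = 0, None
    for addr in schedule:
        if row_of(addr) != open_row:
            count += 1
            open_row = row_of(addr)
    return count

reads = [0, 1, 8, 9]    # reads activate rows 0 and 2
dirty = [2, 3, 10, 11]  # dirty lines awaiting writeback (also rows 0 and 2)

# Lazy: writebacks drain after the reads, reopening both rows.
lazy = activations(reads + dirty)
# Eager: each dirty line is written while its row is open for a read.
eager = activations([0, 1, 2, 3, 8, 9, 10, 11])
print(lazy, eager)  # 4 activations vs. 2
```

Halving the activations in this tiny example mirrors, in miniature, the 38% average reduction the thesis reports across 29 applications.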
|
7 |
Hierarchical Matrix Techniques on Massively Parallel Computers
Izadi, Mohammad, 11 December 2012
Hierarchical matrix (H-matrix) techniques can be used to treat dense matrices efficiently. With an H-matrix, the storage requirements and all fundamental operations, namely matrix-vector multiplication, matrix-matrix multiplication, and matrix inversion, can be handled in almost linear complexity. In this work, we sought further speedup for H-matrix arithmetic by utilizing multiple processors. Our approach to distributing an H-matrix relies on splitting the index set. The main results of this work, based on the index-wise H-distribution, are: a highly scalable algorithm for H-matrix truncation and matrix-vector multiplication, a scalable algorithm for H-matrix matrix-matrix multiplication, and a limited-scalability algorithm for H-matrix inversion on a large number of processors.
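The almost-linear complexity comes from storing admissible blocks in low-rank factored form instead of as dense arrays. A minimal sketch (rank 1, plain lists; a real H-matrix library stores higher ranks and a block cluster tree) shows the factored matrix-vector product in O(m + n) work:

```python
def lowrank_matvec(u, v, x):
    """y = (u v^T) x computed in O(m + n) instead of O(m * n):
    first the scalar s = v . x, then y = s * u."""
    s = sum(vj * xj for vj, xj in zip(v, x))
    return [s * ui for ui in u]

def dense_matvec(A, x):
    """Reference O(m * n) product with the full block, for comparison."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in A]

# An admissible block is kept only as its factors u and v (rank 1 here);
# the dense equivalent is built solely to check the result.
u = [1.0, 2.0, 3.0]
v = [4.0, 5.0]
A = [[ui * vj for vj in v] for ui in u]
x = [1.0, -1.0]

print(lowrank_matvec(u, v, x))  # [-1.0, -2.0, -3.0]
print(dense_matvec(A, x))       # identical result from the dense block
```

Distributing such blocks across processors by splitting the index set, as this work does, then parallelizes the matvec block by block.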
|
8 |
Energy Management for Virtual Machines
Ye, Lei, January 2013
Current computing infrastructures use virtualization to increase resource utilization by deploying multiple virtual machines on the same hardware. Virtualization is particularly attractive for data centers, cloud computing, and hosting services; in these environments computer systems are typically configured with fast processors, large physical memory, and huge storage capable of supporting concurrent execution of virtual machines. This high demand for resources translates directly into higher energy consumption and monetary costs, so managing the energy consumption of virtual machines is becoming increasingly critical. However, virtual machines make energy management more challenging because a layer of virtualization separates the hardware from the guest operating system executing inside a virtual machine. This dissertation addresses the challenge of designing energy-efficient storage, memory, and buffer caches for virtual machines by exploring innovative mechanisms as well as existing approaches. We analyze the architecture of the open-source virtual machine platform Xen and address energy management in each subsystem. For the storage system, we study the I/O behavior of virtual machine systems; we address the isolation between the virtual machine monitor and the virtual machines, and increase the burstiness of disk accesses to improve energy efficiency. In addition, we propose transparent energy management of main memory for any type of guest operating system running inside a virtual machine. Furthermore, we design a dedicated mechanism for the buffer cache, based on the fact that data-intensive applications rely heavily on a large buffer cache that occupies a majority of physical memory. We also propose a novel hybrid mechanism that improves energy efficiency for any memory access. All of these mechanisms achieve significant energy savings with low performance impact on virtual machines.
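Why increasing the burstiness of disk accesses saves energy can be sketched with a toy timeline: a disk can only power down when an idle gap exceeds its spin-down threshold, so batching writes creates usable gaps. The threshold and timestamps are assumptions for illustration, not measurements from the dissertation.

```python
SPINDOWN_THRESHOLD = 5.0  # seconds of idleness needed to power down (assumed)

def idle_gaps(write_times):
    """Lengths of the idle intervals between consecutive disk writes."""
    return [b - a for a, b in zip(write_times, write_times[1:])]

def spindown_opportunities(write_times):
    """How many idle gaps are long enough for the disk to power down."""
    return sum(1 for g in idle_gaps(write_times) if g >= SPINDOWN_THRESHOLD)

scattered = [0.0, 2.0, 4.0, 6.0, 8.0]  # writes issued as they arrive
batched = [0.0, 0.1, 0.2, 8.0, 8.1]    # same writes, flushed in two bursts

print(spindown_opportunities(scattered))  # 0: no gap is long enough
print(spindown_opportunities(batched))    # 1: a 7.8 s gap lets the disk rest
```

The same total work is done in both schedules; only the batched one gives the disk a chance to enter a low-power state.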
|
9 |
Secure memory system with counterfeiting defense (tamper proof)
Σταχούλης, Δημήτριος (Stachoulis, Dimitrios), 19 January 2011
This diploma thesis concerns the security of a system that uses some type of memory to store information. Three methods of protecting the information stored in memory are described. Since the interest lies in the absolute protection of confidential data, the three methods are evaluated on that basis. We therefore settle on one of them, which exploits the unreliability and instability of the memory under specific supply-voltage conditions. Through simulation, we determine the points at which, after a supply voltage within a specific range is applied, the memory no longer retains the confidential information it previously stored. These results can be used to develop a security system, based on the supply voltage applied to the memory cells, for protecting stored data.
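The defense described above can be caricatured as a retention-voltage model: below some supply voltage the memory cells lose their contents, erasing the secret. The threshold value and the all-or-nothing behavior are deliberate simplifications of what the thesis determines by circuit simulation.

```python
V_RETENTION = 0.6  # assumed minimum supply voltage (V) for data retention

def after_supply_drop(bits, vdd):
    """Toy model of stored bits after the supply briefly drops to vdd:
    below the retention point the cells lose their contents (None)."""
    if vdd < V_RETENTION:
        return [None] * len(bits)
    return list(bits)

secret = [1, 0, 1, 1]
print(after_supply_drop(secret, 1.0))  # [1, 0, 1, 1]: data survives
print(after_supply_drop(secret, 0.3))  # [None, None, None, None]: erased
```

A tamper-proof system built on this effect would deliberately drive the supply into the erasing range when tampering is detected.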
|
10 |
Effekten av chunking och association vid inlärningen av kinesiska tecken (The effect of chunking and association on the learning of Chinese characters)
Holmberg, Lina, January 2018
Learning Chinese characters is challenging for many people, especially those not born in Chinese-speaking countries. Applying the theory of chunking, and constructing stories around compound characters as mnemonic techniques to form associations in the learners, this study examines whether this method can facilitate immediate learning and retention of Chinese characters. A case study strategy was adopted, with a native-language Chinese teaching class in Sweden with six pupils chosen as the case. The six participants were divided into an experimental group and a control group. The participants in the experimental group learned Chinese characters with the method of chunking and association, while the control group learned characters by stroke order. The test results indicated that the experimental group performed better than the control group, especially on the character tests. The experimental group used chunking, radicals, and associative interpretation as strategies in the character and meaning tests. The control group, on the other hand, passed the meaning tests very well; surprisingly, it also used associative interpretation and scored only slightly lower than the experimental group on the meaning tests. The experimental group reported that chunking a character into its corresponding chunks and radicals facilitates memorization. Consequently, this case study concludes that the method of chunking and association is beneficial for the understanding and learning of Chinese characters, and that knowledge of this method can improve the learning of character meanings.
|