Global ETD Search

1	Energy Efficiency of Scratch-Pad Memory at 65 nm and Below: An Empirical Study Takase, Hideki, Tomiyama, Hiroyuki, Zeng, Gang, Takada, Hiroaki 07 1900 (has links) No description available. Energy consumption scratch-pad memory embedded systems deep submicron
2	Υλοποίηση DMA για υπολογιστικό σύστημα με scratch pad μνήμη και βελτιστοποιημένη υλοποίηση εφαρμογών Μπαλταγιάννης, Αγαμέμνων 18 March 2009 (has links) Κύριος σκοπός της εργασίας είναι η υλοποίηση ενός υπολογιστικού συστήματος με Scratch pad μνήμη και η διαχείριση της μνήμης μέσω ενσωματωμένου λογισμικού. Αρχικά παρουσιάζονται τα πλεονεκτήματα και τα μειονεκτήματα ενός συστήματος που χρησιμοποιεί μνήμη Scratch pad σε σύγκριση με ένα αντίστοιχο σύστημα με cache. Μετά σχεδιάζουμε το σύστημα μας χρησιμοποιώντας την γλώσσα περιγραφής υλικού VHDL και λαμβάνουμε πειραματικές μετρήσεις οι οποίες προκύπτουν από την μέτρηση των κύκλων εκτέλεσης ενός αντιπροσωπευτικού προγράμματος. Η προτεινόμενη αρχιτεκτονική με Scratch pad και η τεχνική προγραμματισμού της αποφέρουν μια βελτίωση της απόδοσης κατά 36% σε σχέση με την αντίστοιχη αρχιτεκτονική με cache. Αυτό οφείλεται στις σημαντικά λιγότερες αστοχίες που παρουσιάζει η Scratch pad όταν προγραμματιστεί κατάλληλα καθώς ο DMA ελεγκτής έχει τη δυνατότητα να μεταφέρει τα δεδομένα παράλληλα με την εκτέλεση του προγράμματος. / The main purpose of this master thesis is the implementation of a computer system using scratch pad memory including memory management via embedded software. Initially we present the pros and cons of a system using scratch pad memory, in comparison to a system using cache memory. We then design our system using the hardware description language VHDL and we compare the performance with an equivalent architecture using cache memory. This is done by counting the clock cycles needed in order to run a sample program. The proposed scratch pad architecture and the programming technique used produced a 36% better performance in comparison to an equivalent cache memory architecture. This is due to the less misses that a scratch pad memory presents, when programmed efficiently. 005.435 Scratch pad VHDL DMA
3	Υλοποίηση αρχιτεκτονικής για επεξεργαστή VLIW με χρήση μνήμης Scratch-pad Γιαννακοπούλου, Γεωργία, Τσούνης, Γεώργιος 16 June 2011 (has links) Στην παρούσα διπλωματική εργασία, γίνεται η περιγραφή των χαρακτηριστικών των VLIW επεξεργαστών, συγκριτικά με άλλους επεξεργαστές, και στη συνέχεια αναλύεται ο τρόπος με τον οποίο υλοποιήθηκε ένα σύστημα, βασισμένο στη VLIW αρχιτεκτονική. Επιπλέον, παρουσιάζονται τα χαρακτηριστικά των Scratch-pad μνημών, συγκρίνοντάς τα με αυτά των Cache, ενώ υλοποιούνται Scratch-pad μνήμες, στις οποίες θα γίνεται η αποθήκευση των εντολών και των δεδομένων προγραμμάτων που θα εκτελεί ο επεξεργαστής VLIW. Τέλος, αναπτύχθηκε μια εφαρμογή επεξεργασίας εικόνας, με σκοπό να γίνει ο έλεγχος της συμπεριφοράς του συστήματος. / This project describes the characteristics of VLIW processors, compared to other types of processors, and analyses the way in which a system, based on the VLIW architecture, was created. In addition, Scratch-pad memories are compared to Cache memories and added to the system, in order to store the instructions and data of programs being executed by the VLIW processor. Finally, an image processing algorithm was developed with a view to simulate the system's behavior. Επεξεργαστές Μνήμες 621.395 Processors Memories VLIW Scratch-pad
4	Improving Code Overlay Performance by Pre-fetching in Scratch Pad Memory Systems January 2011 (has links) abstract: Advances in electronics technology and innovative manufacturing processes have driven the semiconductor industry towards extensive miniaturization & ever greater integration of chip design. One consequence of this sustained evolution has been the growing relative cost of accessing off-chip components with external memory being one of the dominant contributors. In embedded systems and applications, where power consumption and cost are extremely crucial factors, the use of on chip Scratch Pad Memories (SPMs) has proven to be a good alternative to caches. SPMs are more efficient than on-chip caches in a wide variety of aspects including energy consumption, power dissipation, speed performance, area, and timing predictability. However, at the same time, they entail explicit software-level management. Specifically, the system performance depends upon overlay scheme for mapping code and data onto the size-limited SPMs. It has been found that for applications with large code sizes, the overlay overhead cost becomes significant. This work aims to evaluate and implement pre-fetching as a performance improvement technique for SPMs. It is implemented in code overlay manager, provided with the Cell Broadband Engine (CBE) Synergistic Processing Unit (SPU) compiler from IBM, spu-gcc. Four different approaches proposed in this work use profiling information to predict pre-fetch calls. The pre-fetching technique achieves considerable performance improvement by hiding some of the code overlay cost behind active computations by fetching the required code segment in advance into SPM. Experimental results supporting this claim are obtained using the IBM Cell architecture platform with substantial gain of more than 30%. / Dissertation/Thesis / M.S. Computer Science 2011 Computer Science Code Overlay Multicore Scratch Pad Memory
5	Scratch-pad memory management for static data aggregates Li, Lian, Computer Science & Engineering, Faculty of Engineering, UNSW January 2007 (has links) Scratch-pad memory (SPM), a fast on-chip SRAM managed by software, is widely used in embedded systems. Compared to hardware-managed cache, SPM can be more efficient in performance, power and area cost, and has the added advantage of better time predictability. In this thesis, SPMs should be seen in a general context. For example, in stream processors, a software-managed stream register file is usually used to stage data to and from off-chip memory. In IBM's Cell architecture, each co-processor has a software-managed local store for keeping data and instructions. SPM management is critical for SPM-based embedded systems. In this thesis, we propose two novel methodologies, the memory colouring methodology and the perfect colouring methodology, to place the static data aggregates such as arrays and structs of a program in SPM. Our methodologies are dynamic in the sense that some data aggregates can be swapped into and out of SPM during program execution. To this end, a live range splitting heuristic is introduced in order to create potential data transfer statements between SPM and off-chip memory. The memory colouring methodology is a general-purpose compiler approach. The novelty of this approach lies in partitioning an SPM into a pseudo register file then generalising existing graph colouring algorithms for register allocation to colour data aggregates. In this thesis, a scheme for partitioning an SPM into a pseudo register file is introduced. This methodology is inter-procedural and therefore operates on the interference graph for the data aggregates in the whole program. Different graph colouring algorithms may give rise to different results due to live range splitting and spilling heuristics used. As a result, two representative graph colouring algorithms, George and Appel's iterative-coalescing and Park and Moon's optimistic-coalescing, are generalised and evaluated for SPM allocation. Like memory colouring, perfect colouring is also inter-procedural. The novelty of this second methodology lies in formulating the SPM allocation problem as an interval colouring problem. The interval colouring problem is an NP problem and no widely-accepted approximation algorithms exist. The key observation is that the interference graphs for data aggregates in many embedded applications form a special class of superperfect graphs. This has led to the development of two additional SPM allocation algorithms. While differing in whether live range splits and spills are done sequentially or together, both algorithms place data aggregates in SPM based on the cliques in an interference graph. In both cases, we guarantee optimally that all data aggregates in an interference graph can be placed in SPM if the given SPM size is no smaller than the chromatic number of the graph. We have developed two memory colouring algorithms and two perfect colouring algorithms for SPM allocation. We have evaluated them using a set of embedded applications. Our results show that both methodologies are efficient and effective in handling large-scale embedded applications. While neither methodology outperforms the other consistently, perfect colouring has yielded better overall results in the set of benchmarks used in our experiments. All these algorithms are expected to be valuable. For example, they can be made available as part of the same compiler framework to assist the embedded designer with exploring a large number of optimisation opportunities for a particular embedded application. Scratch-pad memory. Compiler. Embedded computer systems. Memory management (Computer science)
6	Scratch-pad memory management for static data aggregates Li, Lian, Computer Science & Engineering, Faculty of Engineering, UNSW January 2007 (has links) Scratch-pad memory (SPM), a fast on-chip SRAM managed by software, is widely used in embedded systems. Compared to hardware-managed cache, SPM can be more efficient in performance, power and area cost, and has the added advantage of better time predictability. In this thesis, SPMs should be seen in a general context. For example, in stream processors, a software-managed stream register file is usually used to stage data to and from off-chip memory. In IBM's Cell architecture, each co-processor has a software-managed local store for keeping data and instructions. SPM management is critical for SPM-based embedded systems. In this thesis, we propose two novel methodologies, the memory colouring methodology and the perfect colouring methodology, to place the static data aggregates such as arrays and structs of a program in SPM. Our methodologies are dynamic in the sense that some data aggregates can be swapped into and out of SPM during program execution. To this end, a live range splitting heuristic is introduced in order to create potential data transfer statements between SPM and off-chip memory. The memory colouring methodology is a general-purpose compiler approach. The novelty of this approach lies in partitioning an SPM into a pseudo register file then generalising existing graph colouring algorithms for register allocation to colour data aggregates. In this thesis, a scheme for partitioning an SPM into a pseudo register file is introduced. This methodology is inter-procedural and therefore operates on the interference graph for the data aggregates in the whole program. Different graph colouring algorithms may give rise to different results due to live range splitting and spilling heuristics used. As a result, two representative graph colouring algorithms, George and Appel's iterative-coalescing and Park and Moon's optimistic-coalescing, are generalised and evaluated for SPM allocation. Like memory colouring, perfect colouring is also inter-procedural. The novelty of this second methodology lies in formulating the SPM allocation problem as an interval colouring problem. The interval colouring problem is an NP problem and no widely-accepted approximation algorithms exist. The key observation is that the interference graphs for data aggregates in many embedded applications form a special class of superperfect graphs. This has led to the development of two additional SPM allocation algorithms. While differing in whether live range splits and spills are done sequentially or together, both algorithms place data aggregates in SPM based on the cliques in an interference graph. In both cases, we guarantee optimally that all data aggregates in an interference graph can be placed in SPM if the given SPM size is no smaller than the chromatic number of the graph. We have developed two memory colouring algorithms and two perfect colouring algorithms for SPM allocation. We have evaluated them using a set of embedded applications. Our results show that both methodologies are efficient and effective in handling large-scale embedded applications. While neither methodology outperforms the other consistently, perfect colouring has yielded better overall results in the set of benchmarks used in our experiments. All these algorithms are expected to be valuable. For example, they can be made available as part of the same compiler framework to assist the embedded designer with exploring a large number of optimisation opportunities for a particular embedded application. Scratch-pad memory. Compiler. Embedded computer systems. Memory management (Computer science)
7	Scratch-pad memory management for static data aggregates Li, Lian, Computer Science & Engineering, Faculty of Engineering, UNSW January 2007 (has links) Scratch-pad memory (SPM), a fast on-chip SRAM managed by software, is widely used in embedded systems. Compared to hardware-managed cache, SPM can be more efficient in performance, power and area cost, and has the added advantage of better time predictability. In this thesis, SPMs should be seen in a general context. For example, in stream processors, a software-managed stream register file is usually used to stage data to and from off-chip memory. In IBM's Cell architecture, each co-processor has a software-managed local store for keeping data and instructions. SPM management is critical for SPM-based embedded systems. In this thesis, we propose two novel methodologies, the memory colouring methodology and the perfect colouring methodology, to place the static data aggregates such as arrays and structs of a program in SPM. Our methodologies are dynamic in the sense that some data aggregates can be swapped into and out of SPM during program execution. To this end, a live range splitting heuristic is introduced in order to create potential data transfer statements between SPM and off-chip memory. The memory colouring methodology is a general-purpose compiler approach. The novelty of this approach lies in partitioning an SPM into a pseudo register file then generalising existing graph colouring algorithms for register allocation to colour data aggregates. In this thesis, a scheme for partitioning an SPM into a pseudo register file is introduced. This methodology is inter-procedural and therefore operates on the interference graph for the data aggregates in the whole program. Different graph colouring algorithms may give rise to different results due to live range splitting and spilling heuristics used. As a result, two representative graph colouring algorithms, George and Appel's iterative-coalescing and Park and Moon's optimistic-coalescing, are generalised and evaluated for SPM allocation. Like memory colouring, perfect colouring is also inter-procedural. The novelty of this second methodology lies in formulating the SPM allocation problem as an interval colouring problem. The interval colouring problem is an NP problem and no widely-accepted approximation algorithms exist. The key observation is that the interference graphs for data aggregates in many embedded applications form a special class of superperfect graphs. This has led to the development of two additional SPM allocation algorithms. While differing in whether live range splits and spills are done sequentially or together, both algorithms place data aggregates in SPM based on the cliques in an interference graph. In both cases, we guarantee optimally that all data aggregates in an interference graph can be placed in SPM if the given SPM size is no smaller than the chromatic number of the graph. We have developed two memory colouring algorithms and two perfect colouring algorithms for SPM allocation. We have evaluated them using a set of embedded applications. Our results show that both methodologies are efficient and effective in handling large-scale embedded applications. While neither methodology outperforms the other consistently, perfect colouring has yielded better overall results in the set of benchmarks used in our experiments. All these algorithms are expected to be valuable. For example, they can be made available as part of the same compiler framework to assist the embedded designer with exploring a large number of optimisation opportunities for a particular embedded application. Scratch-pad memory. Compiler. Embedded computer systems. Memory management (Computer science)
8	Scratch-pad memory management for static data aggregates Li, Lian, Computer Science & Engineering, Faculty of Engineering, UNSW January 2007 (has links) Scratch-pad memory (SPM), a fast on-chip SRAM managed by software, is widely used in embedded systems. Compared to hardware-managed cache, SPM can be more efficient in performance, power and area cost, and has the added advantage of better time predictability. In this thesis, SPMs should be seen in a general context. For example, in stream processors, a software-managed stream register file is usually used to stage data to and from off-chip memory. In IBM's Cell architecture, each co-processor has a software-managed local store for keeping data and instructions. SPM management is critical for SPM-based embedded systems. In this thesis, we propose two novel methodologies, the memory colouring methodology and the perfect colouring methodology, to place the static data aggregates such as arrays and structs of a program in SPM. Our methodologies are dynamic in the sense that some data aggregates can be swapped into and out of SPM during program execution. To this end, a live range splitting heuristic is introduced in order to create potential data transfer statements between SPM and off-chip memory. The memory colouring methodology is a general-purpose compiler approach. The novelty of this approach lies in partitioning an SPM into a pseudo register file then generalising existing graph colouring algorithms for register allocation to colour data aggregates. In this thesis, a scheme for partitioning an SPM into a pseudo register file is introduced. This methodology is inter-procedural and therefore operates on the interference graph for the data aggregates in the whole program. Different graph colouring algorithms may give rise to different results due to live range splitting and spilling heuristics used. As a result, two representative graph colouring algorithms, George and Appel's iterative-coalescing and Park and Moon's optimistic-coalescing, are generalised and evaluated for SPM allocation. Like memory colouring, perfect colouring is also inter-procedural. The novelty of this second methodology lies in formulating the SPM allocation problem as an interval colouring problem. The interval colouring problem is an NP problem and no widely-accepted approximation algorithms exist. The key observation is that the interference graphs for data aggregates in many embedded applications form a special class of superperfect graphs. This has led to the development of two additional SPM allocation algorithms. While differing in whether live range splits and spills are done sequentially or together, both algorithms place data aggregates in SPM based on the cliques in an interference graph. In both cases, we guarantee optimally that all data aggregates in an interference graph can be placed in SPM if the given SPM size is no smaller than the chromatic number of the graph. We have developed two memory colouring algorithms and two perfect colouring algorithms for SPM allocation. We have evaluated them using a set of embedded applications. Our results show that both methodologies are efficient and effective in handling large-scale embedded applications. While neither methodology outperforms the other consistently, perfect colouring has yielded better overall results in the set of benchmarks used in our experiments. All these algorithms are expected to be valuable. For example, they can be made available as part of the same compiler framework to assist the embedded designer with exploring a large number of optimisation opportunities for a particular embedded application. Scratch-pad memory. Compiler. Embedded computer systems. Memory management (Computer science)
9	Scratch-pad memory management for static data aggregates Li, Lian, Computer Science & Engineering, Faculty of Engineering, UNSW January 2007 (has links) Scratch-pad memory (SPM), a fast on-chip SRAM managed by software, is widely used in embedded systems. Compared to hardware-managed cache, SPM can be more efficient in performance, power and area cost, and has the added advantage of better time predictability. In this thesis, SPMs should be seen in a general context. For example, in stream processors, a software-managed stream register file is usually used to stage data to and from off-chip memory. In IBM's Cell architecture, each co-processor has a software-managed local store for keeping data and instructions. SPM management is critical for SPM-based embedded systems. In this thesis, we propose two novel methodologies, the memory colouring methodology and the perfect colouring methodology, to place the static data aggregates such as arrays and structs of a program in SPM. Our methodologies are dynamic in the sense that some data aggregates can be swapped into and out of SPM during program execution. To this end, a live range splitting heuristic is introduced in order to create potential data transfer statements between SPM and off-chip memory. The memory colouring methodology is a general-purpose compiler approach. The novelty of this approach lies in partitioning an SPM into a pseudo register file then generalising existing graph colouring algorithms for register allocation to colour data aggregates. In this thesis, a scheme for partitioning an SPM into a pseudo register file is introduced. This methodology is inter-procedural and therefore operates on the interference graph for the data aggregates in the whole program. Different graph colouring algorithms may give rise to different results due to live range splitting and spilling heuristics used. As a result, two representative graph colouring algorithms, George and Appel's iterative-coalescing and Park and Moon's optimistic-coalescing, are generalised and evaluated for SPM allocation. Like memory colouring, perfect colouring is also inter-procedural. The novelty of this second methodology lies in formulating the SPM allocation problem as an interval colouring problem. The interval colouring problem is an NP problem and no widely-accepted approximation algorithms exist. The key observation is that the interference graphs for data aggregates in many embedded applications form a special class of superperfect graphs. This has led to the development of two additional SPM allocation algorithms. While differing in whether live range splits and spills are done sequentially or together, both algorithms place data aggregates in SPM based on the cliques in an interference graph. In both cases, we guarantee optimally that all data aggregates in an interference graph can be placed in SPM if the given SPM size is no smaller than the chromatic number of the graph. We have developed two memory colouring algorithms and two perfect colouring algorithms for SPM allocation. We have evaluated them using a set of embedded applications. Our results show that both methodologies are efficient and effective in handling large-scale embedded applications. While neither methodology outperforms the other consistently, perfect colouring has yielded better overall results in the set of benchmarks used in our experiments. All these algorithms are expected to be valuable. For example, they can be made available as part of the same compiler framework to assist the embedded designer with exploring a large number of optimisation opportunities for a particular embedded application. Scratch-pad memory. Compiler. Embedded computer systems. Memory management (Computer science)
10	Scratch-pad memory management for static data aggregates Li, Lian, Computer Science & Engineering, Faculty of Engineering, UNSW January 2007 (has links) Scratch-pad memory (SPM), a fast on-chip SRAM managed by software, is widely used in embedded systems. Compared to hardware-managed cache, SPM can be more efficient in performance, power and area cost, and has the added advantage of better time predictability. In this thesis, SPMs should be seen in a general context. For example, in stream processors, a software-managed stream register file is usually used to stage data to and from off-chip memory. In IBM's Cell architecture, each co-processor has a software-managed local store for keeping data and instructions. SPM management is critical for SPM-based embedded systems. In this thesis, we propose two novel methodologies, the memory colouring methodology and the perfect colouring methodology, to place the static data aggregates such as arrays and structs of a program in SPM. Our methodologies are dynamic in the sense that some data aggregates can be swapped into and out of SPM during program execution. To this end, a live range splitting heuristic is introduced in order to create potential data transfer statements between SPM and off-chip memory. The memory colouring methodology is a general-purpose compiler approach. The novelty of this approach lies in partitioning an SPM into a pseudo register file then generalising existing graph colouring algorithms for register allocation to colour data aggregates. In this thesis, a scheme for partitioning an SPM into a pseudo register file is introduced. This methodology is inter-procedural and therefore operates on the interference graph for the data aggregates in the whole program. Different graph colouring algorithms may give rise to different results due to live range splitting and spilling heuristics used. As a result, two representative graph colouring algorithms, George and Appel's iterative-coalescing and Park and Moon's optimistic-coalescing, are generalised and evaluated for SPM allocation. Like memory colouring, perfect colouring is also inter-procedural. The novelty of this second methodology lies in formulating the SPM allocation problem as an interval colouring problem. The interval colouring problem is an NP problem and no widely-accepted approximation algorithms exist. The key observation is that the interference graphs for data aggregates in many embedded applications form a special class of superperfect graphs. This has led to the development of two additional SPM allocation algorithms. While differing in whether live range splits and spills are done sequentially or together, both algorithms place data aggregates in SPM based on the cliques in an interference graph. In both cases, we guarantee optimally that all data aggregates in an interference graph can be placed in SPM if the given SPM size is no smaller than the chromatic number of the graph. We have developed two memory colouring algorithms and two perfect colouring algorithms for SPM allocation. We have evaluated them using a set of embedded applications. Our results show that both methodologies are efficient and effective in handling large-scale embedded applications. While neither methodology outperforms the other consistently, perfect colouring has yielded better overall results in the set of benchmarks used in our experiments. All these algorithms are expected to be valuable. For example, they can be made available as part of the same compiler framework to assist the embedded designer with exploring a large number of optimisation opportunities for a particular embedded application. Scratch-pad memory. Compiler. Embedded computer systems. Memory management (Computer science)

Search results