Replacement and placement policies for prefetched lines.

January 1998 (has links)
by Sze Siu Ching. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1998. / Includes bibliographical references (leaves 119-122). / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Overlapping Computations with Memory Accesses --- p.3 / Chapter 1.2 --- Cache Line Replacement Policies --- p.4 / Chapter 1.3 --- The Rest of This Paper --- p.4 / Chapter 2 --- A Brief Review of IAP Scheme --- p.6 / Chapter 2.1 --- Embedded Hints for Next Data References --- p.6 / Chapter 2.2 --- Instruction Opcode and Addressing Mode Prefetching --- p.8 / Chapter 2.3 --- Chapter Summary --- p.9 / Chapter 3 --- Motivation --- p.11 / Chapter 3.1 --- Chapter Summary --- p.14 / Chapter 4 --- Related Work --- p.15 / Chapter 4.1 --- Existing Replacement Algorithms --- p.16 / Chapter 4.2 --- Placement Policies for Cache Lines --- p.18 / Chapter 4.3 --- Chapter Summary --- p.20 / Chapter 5 --- Replacement and Placement Policies of Prefetched Lines --- p.21 / Chapter 5.1 --- IZ Cache Line Replacement Policy in IAP scheme --- p.22 / Chapter 5.1.1 --- The Instant Zero Scheme --- p.23 / Chapter 5.2 --- Priority Pre-Updating and Victim Cache --- p.27 / Chapter 5.2.1 --- Priority Pre-Updating --- p.27 / Chapter 5.2.2 --- Priority Pre-Updating for Cache --- p.28 / Chapter 5.2.3 --- Victim Cache for Unreferenced Prefetch Lines --- p.28 / Chapter 5.3 --- Prefetch Cache for IAP Lines --- p.31 / Chapter 5.4 --- Chapter Summary --- p.33 / Chapter 6 --- Performance Evaluation --- p.34 / Chapter 6.1 --- Methodology and metrics --- p.34 / Chapter 6.1.1 --- Trace Driven Simulation --- p.35 / Chapter 6.1.2 --- Caching Models --- p.36 / Chapter 6.1.3 --- Simulation Models and Performance Metrics --- p.39 / Chapter 6.2 --- Simulation Results --- p.43 / Chapter 6.2.1 --- General Results --- p.44 / Chapter 6.3 --- Simulation Results of IZ Replacement Policy --- p.49 / Chapter 6.3.1 --- Analysis To IZ Cache Line Replacement Policy --- p.50 / Chapter 6.4 --- Simulation Results for Priority Pre-Updating with Victim Cache --- p.52 / Chapter 6.4.1 --- PPUVC in Cache with IAP Scheme --- p.52 / Chapter 6.4.2 --- PPUVC in prefetch-on-miss Cache --- p.54 / Chapter 6.5 --- Prefetch Cache --- p.57 / Chapter 6.6 --- Chapter Summary --- p.63 / Chapter 7 --- Architecture Without LOAD-AND-STORE Instructions --- p.64 / Chapter 8 --- Conclusion --- p.66 / Chapter A --- CPI Due to Cache Misses --- p.68 / Chapter A.1 --- Varying Cache Size --- p.68 / Chapter A.1.1 --- Instant Zero Replacement Policy --- p.68 / Chapter A.1.2 --- Priority Pre-Updating with Victim Cache --- p.70 / Chapter A.1.3 --- Prefetch Cache --- p.73 / Chapter A.2 --- Varying Cache Line Size --- p.75 / Chapter A.2.1 --- Instant Zero Replacement Policy --- p.75 / Chapter A.2.2 --- Priority Pre-Updating with Victim Cache --- p.77 / Chapter A.2.3 --- Prefetch Cache --- p.80 / Chapter A.3 --- Varying Cache Set Associative --- p.82 / Chapter A.3.1 --- Instant Zero Replacement Policy --- p.82 / Chapter A.3.2 --- Priority Pre-Updating with Victim Cache --- p.84 / Chapter A.3.3 --- Prefetch Cache --- p.87 / Chapter B --- Simulation Results of IZ Replacement Policy --- p.89 / Chapter B.1 --- Memory Delay Time Reduction --- p.89 / Chapter B.1.1 --- Varying Cache Size --- p.89 / Chapter B.1.2 --- Varying Cache Line Size --- p.91 / Chapter B.1.3 --- Varying Cache Set Associative --- p.93 / Chapter C --- Simulation Results of Priority Pre-Updating with Victim Cache --- p.95 / Chapter C.1 --- PPUVC in IAP Scheme --- p.95 / Chapter C.1.1 --- Memory Delay Time Reduction --- p.95 / Chapter C.2 --- PPUVC in Cache with Prefetch-On-Miss Only --- p.101 / Chapter C.2.1 --- Memory Delay Time Reduction --- p.101 / Chapter D --- Simulation Results of Prefetch Cache --- p.107 / Chapter D.1 --- Memory Delay Time Reduction --- p.107 / Chapter D.1.1 --- Varying Cache Size --- p.107 / Chapter D.1.2 --- Varying Cache Line Size --- p.109 / Chapter D.1.3 --- Varying Cache Set Associative --- p.111 / Chapter D.2 --- Results of the Three Replacement Policies --- p.113 / Chapter D.2.1 --- Varying Cache Size --- p.113 / Chapter D.2.2 --- Varying Cache Line Size --- p.115 / Chapter D.2.3 --- Varying Cache Set Associative --- p.117 / Bibliography --- p.119

Unified on-chip multi-level cache management scheme using processor opcodes and addressing modes.

January 1996 (has links)
by Stephen Siu-ming Wong. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1996. / Includes bibliographical references (leaves 164-170). / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Cache Memory --- p.2 / Chapter 1.2 --- System Performance --- p.3 / Chapter 1.3 --- Cache Performance --- p.3 / Chapter 1.4 --- Cache Prefetching --- p.5 / Chapter 1.5 --- Organization of Dissertation --- p.7 / Chapter 2 --- Related Work --- p.8 / Chapter 2.1 --- Memory Hierarchy --- p.8 / Chapter 2.2 --- Cache Memory Management --- p.10 / Chapter 2.2.1 --- Configuration --- p.10 / Chapter 2.2.2 --- Replacement Algorithms --- p.13 / Chapter 2.2.3 --- Write Back Policies --- p.15 / Chapter 2.2.4 --- Cache Miss Types --- p.16 / Chapter 2.2.5 --- Prefetching --- p.17 / Chapter 2.3 --- Locality --- p.18 / Chapter 2.3.1 --- Spatial vs. Temporal --- p.18 / Chapter 2.3.2 --- Instruction Cache vs. Data Cache --- p.20 / Chapter 2.4 --- Why Not a Large L1 Cache? --- p.26 / Chapter 2.4.1 --- Critical Time Path --- p.26 / Chapter 2.4.2 --- Hardware Cost --- p.27 / Chapter 2.5 --- Trend to have L2 Cache On Chip --- p.28 / Chapter 2.5.1 --- Examples --- p.29 / Chapter 2.5.2 --- Dedicated L2 Bus --- p.31 / Chapter 2.6 --- Hardware Prefetch Algorithms --- p.32 / Chapter 2.6.1 --- One Block Look-ahead --- p.33 / Chapter 2.6.2 --- Chen's RPT & similar algorithms --- p.34 / Chapter 2.7 --- Software Based Prefetch Algorithm --- p.38 / Chapter 2.7.1 --- Prefetch Instruction --- p.38 / Chapter 2.8 --- Hybrid Prefetch Algorithm --- p.40 / Chapter 2.8.1 --- Stride CAM Prefetching --- p.40 / Chapter 3 --- Simulator --- p.43 / Chapter 3.1 --- Multi-level Memory Hierarchy Simulator --- p.43 / Chapter 3.1.1 --- Multi-level Memory Support --- p.45 / Chapter 3.1.2 --- Non-blocking Cache --- p.45 / Chapter 3.1.3 --- Cycle-by-cycle Simulation --- p.47 / Chapter 3.1.4 --- Cache Prefetching Support --- p.47 / Chapter 4 --- Proposed Algorithms --- p.48 / Chapter 4.1 --- SIRPA --- p.48 / Chapter 4.1.1 --- Rationale --- p.48 / Chapter 4.1.2 --- Architecture Model --- p.50 / Chapter 4.2 --- Line Concept --- p.56 / Chapter 4.2.1 --- Rationale --- p.56 / Chapter 4.2.2 --- Improvement Over "Pure" Algorithm --- p.57 / Chapter 4.2.3 --- Architectural Model --- p.59 / Chapter 4.3 --- Combined L1-L2 Cache Management --- p.62 / Chapter 4.3.1 --- Rationale --- p.62 / Chapter 4.3.2 --- Feasibility --- p.63 / Chapter 4.4 --- Combine SIRPA with Default Prefetch --- p.66 / Chapter 4.4.1 --- Rationale --- p.67 / Chapter 4.4.2 --- Improvement Over "Pure" Algorithm --- p.69 / Chapter 4.4.3 --- Architectural Model --- p.70 / Chapter 5 --- Results --- p.73 / Chapter 5.1 --- Benchmarks Used --- p.73 / Chapter 5.1.1 --- SPEC92int and SPEC92fp --- p.75 / Chapter 5.2 --- Configurations Tested --- p.79 / Chapter 5.2.1 --- Prefetch Algorithms --- p.79 / Chapter 5.2.2 --- Cache Sizes --- p.80 / Chapter 5.2.3 --- Cache Block Sizes --- p.81 / Chapter 5.2.4 --- Cache Set Associativities --- p.81 / Chapter 5.2.5 --- Bus Width, Speed and Other Parameters --- p.81 / Chapter 5.3 --- Validity of Results --- p.83 / Chapter 5.3.1 --- Total Instructions and Cycles --- p.83 / Chapter 5.3.2 --- Total Reference to Caches --- p.84 / Chapter 5.4 --- Overall MCPI Comparison --- p.86 / Chapter 5.4.1 --- Cache Size Effect --- p.87 / Chapter 5.4.2 --- Cache Block Size Effect --- p.91 / Chapter 5.4.3 --- Set Associativity Effect --- p.101 / Chapter 5.4.4 --- Hardware Prefetch Algorithms --- p.108 / Chapter 5.4.5 --- Software Based Prefetch Algorithms --- p.119 / Chapter 5.5 --- L2 Cache & Main Memory MCPI Comparison --- p.127 / Chapter 5.5.1 --- Cache Size Effect --- p.130 / Chapter 5.5.2 --- Cache Block Size Effect --- p.130 / Chapter 5.5.3 --- Set Associativity Effect --- p.143 / Chapter 6 --- Conclusion --- p.154 / Chapter 7 --- Future Directions --- p.157 / Chapter 7.1 --- Prefetch Buffer --- p.157 / Chapter 7.2 --- Dissimilar L1-L2 Management --- p.158 / Chapter 7.3 --- Combined LRU/MRU Replacement Policy --- p.160 / Chapter 7.4 --- N Loops Look-ahead --- p.163

Improving on-chip data cache using instruction register information.

January 1996 (has links)
by Lau Siu Chung. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1996. / Includes bibliographical references (leaves 71-74). / Abstract --- p.i / Acknowledgment --- p.ii / List of Figures --- p.v / Chapter Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Hiding memory latency --- p.1 / Chapter 1.2 --- Organization of dissertation --- p.4 / Chapter Chapter 2 --- Related Work --- p.5 / Chapter 2.1 --- Hardware controlled cache prefetching --- p.5 / Chapter 2.2 --- Software assisted cache prefetching --- p.9 / Chapter Chapter 3 --- Data Prefetching --- p.13 / Chapter 3.1 --- Data reference patterns --- p.14 / Chapter 3.2 --- Embedded hints for next data references --- p.19 / Chapter 3.3 --- Instruction Opcode and Addressing Mode Prefetching scheme --- p.21 / Chapter 3.3.1 --- Basic IAP scheme --- p.21 / Chapter 3.3.2 --- Enhanced IAP scheme --- p.24 / Chapter 3.3.3 --- Combined IAP scheme --- p.27 / Chapter 3.4 --- Summary --- p.29 / Chapter Chapter 4 --- Performance Evaluation --- p.31 / Chapter 4.1 --- Evaluation methodology --- p.31 / Chapter 4.1.1 --- Trace-driven simulation --- p.31 / Chapter 4.1.2 --- Caching models --- p.33 / Chapter 4.1.3 --- Benchmarks and metrics --- p.36 / Chapter 4.2 --- General Results --- p.41 / Chapter 4.2.1 --- Varying cache size --- p.44 / Chapter 4.2.2 --- Varying cache block size --- p.46 / Chapter 4.2.3 --- Varying associativity --- p.49 / Chapter 4.3 --- Other performance metrics --- p.52 / Chapter 4.3.1 --- Accuracy of prefetch --- p.52 / Chapter 4.3.2 --- Partial hit delay --- p.55 / Chapter 4.3.3 --- Bus usage problem --- p.59 / Chapter 4.4 --- Zero time prefetch --- p.63 / Chapter 4.5 --- Summary --- p.67 / Chapter Chapter 5 --- Conclusion --- p.68 / Chapter 5.1 --- Summary of our research --- p.68 / Chapter 5.2 --- Future work --- p.70 / Bibliography --- p.71

Techniques of distributed caching and terminal tracking for mobile computing.

January 1997 (has links)
by Chiu-Fai Fong. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1997. / Includes bibliographical references (leaves 76-81). / Abstract --- p.i / Acknowledgments --- p.iii / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Distributed Data Caching --- p.2 / Chapter 1.2 --- Mobile Terminal Tracking --- p.5 / Chapter 1.3 --- Thesis Overview --- p.10 / Chapter 2 --- Personal Communication Network --- p.11 / Chapter 2.1 --- Network Architecture --- p.11 / Chapter 2.2 --- Resource Limitations --- p.13 / Chapter 2.3 --- Mobility --- p.14 / Chapter 3 --- Distributed Data Caching --- p.17 / Chapter 3.1 --- System Model --- p.18 / Chapter 3.1.1 --- The Wireless Network Environment --- p.18 / Chapter 3.1.2 --- Caching Protocol --- p.19 / Chapter 3.2 --- Caching at Mobile Computers --- p.22 / Chapter 3.3 --- Broadcasting at the Server --- p.24 / Chapter 3.3.1 --- Passive Strategy --- p.27 / Chapter 3.3.2 --- Active Strategy --- p.27 / Chapter 3.4 --- Performance Analysis --- p.29 / Chapter 3.4.1 --- Bandwidth Requirements --- p.29 / Chapter 3.4.2 --- Lower Bound on the Optimal Bandwidth Consumption --- p.30 / Chapter 3.4.3 --- The Read Response Time --- p.32 / Chapter 3.5 --- Experiments --- p.35 / Chapter 3.6 --- Mobility Concerns --- p.42 / Chapter 4 --- Mobile Terminal Tracking --- p.44 / Chapter 4.1 --- Movement Model --- p.45 / Chapter 4.2 --- Optimal Paging --- p.48 / Chapter 4.3 --- Transient Analysis --- p.52 / Chapter 4.3.1 --- The Time-Based Protocol --- p.55 / Chapter 4.3.2 --- Distance-Based Protocol --- p.59 / Chapter 4.4 --- The Reverse-Guessing Protocol --- p.64 / Chapter 4.5 --- Experiments --- p.66 / Chapter 5 --- Conclusions & Future Work --- p.71 / Chapter 5.1 --- Distributed Data Caching --- p.72 / Chapter 5.2 --- Mobile Terminal Tracking --- p.73 / Bibliography --- p.76 / A Proof of NP-hardness of the Broadcast Set Assignment Problem --- p.82

Optimization of instruction memory for embedded systems

Janapsatya, Andhi, Computer Science & Engineering, Faculty of Engineering, UNSW, January 2005 (has links)
This thesis presents methodologies for improving system performance and reducing energy consumption by optimizing the memory hierarchy. The processor-memory performance gap is a well-known problem that is predicted to worsen as the gap continues to widen. The author describes a method to estimate the best L1 cache configuration for a given application, and presents three further methods that improve performance and reduce energy in embedded systems by optimizing the instruction memory. Performance estimation is an important procedure for assessing the performance of a system and the effectiveness of any applied optimizations. A cache memory performance estimation methodology is presented in this thesis, designed to quickly and accurately estimate the performance of multiple cache memory configurations. Experimental results show that the methodology is on average 45 times faster than a widely used tool (Dinero IV). The first optimization method, code placement, is a software-only technique for improving instruction cache performance. It carefully places code within memory to ensure a high hit rate when code is brought into the cache, thereby improving cache memory performance. Experimental results show that code placement reduces the cache miss rate by up to 71% and energy consumption by up to 63% compared with the same application without code placement. The second method is a novel architecture for utilizing scratchpad memory as a replacement for the instruction cache. A hardware modification allows data to be written into the scratchpad memory during program execution, enabling dynamic control of its contents. Because scratchpad memory has a faster access time and lower energy per access than cache memory, its use aims to improve performance and lower energy consumption relative to a cache-based system. Experimental results show an average energy reduction of 26.59% and an average performance improvement of 25.63% compared with a system using a cache. The third method is an application profiling technique that uses statistical information to identify an application's hot-spots. Profiling is important for identifying sections of an application where performance may degrade or where the greatest gain can be obtained through optimization. The method was applied and tested on the scratchpad-based system described in this thesis; experimental results show its effectiveness in reducing energy and improving performance compared with a previous method for utilizing the scratchpad-based system (an average performance improvement of 23.6% and an average energy reduction of 27.1%).
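
For orientation, the per-configuration estimates this abstract refers to are conventionally obtained by trace-driven cache simulation. The sketch below is only a generic, minimal illustration of that baseline; it is not the author's estimation methodology or the Dinero IV tool, and the LRU policy, the synthetic trace, and the cache size/block/associativity values are assumptions chosen purely for the example.

    # Minimal trace-driven cache miss-rate estimation (illustrative sketch only).
    # The policy (LRU), the synthetic trace, and all parameter values below are
    # assumptions for the example, not the thesis's tool or configuration.
    from collections import OrderedDict

    def miss_rate(trace, cache_bytes, block_bytes, ways):
        """Miss rate of an LRU set-associative cache on a list of byte addresses."""
        num_sets = cache_bytes // (block_bytes * ways)
        sets = [OrderedDict() for _ in range(num_sets)]   # per-set tags kept in LRU order
        misses = 0
        for addr in trace:
            block = addr // block_bytes
            index = block % num_sets
            tag = block // num_sets
            lru = sets[index]
            if tag in lru:
                lru.move_to_end(tag)        # hit: mark this tag most recently used
            else:
                misses += 1                 # miss: install the block, evicting the LRU tag if the set is full
                if len(lru) >= ways:
                    lru.popitem(last=False)
                lru[tag] = None
        return misses / len(trace)

    if __name__ == "__main__":
        # Synthetic trace: repeated 4-byte-stride sweeps over a 24 KiB array.
        trace = [(i * 4) % (24 * 1024) for i in range(300_000)]
        for kib in (4, 8, 16, 32):
            r = miss_rate(trace, kib * 1024, block_bytes=32, ways=2)
            print(f"{kib:>2} KiB, 32 B lines, 2-way LRU: miss rate = {r:.4f}")

Sweeping cache sizes in this brute-force way is how one would conventionally search for a good L1 configuration; the methodology summarized above addresses the same question while running, on average, 45 times faster than simulating each configuration with Dinero IV.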

Multifractal analysis of memory usage patterns

Crowell, Jonathan B., January 2001 (has links)
Thesis (M.S.)--West Virginia University, 2001. / Title from document title page. Document formatted into pages; contains vii, 47 p. : ill. Includes abstract. Includes bibliographical references (p. 45-47).

Memory management and transaction scheduling for large-scale databases /

Sinha, Aman, January 1999 (has links)
Thesis (Ph. D.)--University of Texas at Austin, 1999. / Vita. Includes bibliographical references (leaves 125-135). Available also in a digital version from Dissertation Abstracts.

Regions and control /

Semmelroth, Miley Edward, January 2003 (has links)
Thesis (Ph. D.)--University of Oregon, 2003. / Typescript. Includes vita and abstract. Includes bibliographical references (leaves 137-141). Also available for download via the World Wide Web; free to University of Oregon users.
