Spelling suggestions: "subject:"cache."" "subject:"vache.""
211 |
Software-assisted data prefetching algorithms.January 1995 (has links)
by Chi-sum, Ho. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1995. / Includes bibliographical references (leaves 110-113). / Abstract --- p.i / Acknowledgement --- p.iii / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Overview --- p.1 / Chapter 1.2 --- Cache Memories --- p.1 / Chapter 1.3 --- Improving Cache Performance --- p.3 / Chapter 1.4 --- Improving System Performance --- p.4 / Chapter 1.5 --- Organization of the dissertation --- p.6 / Chapter 2 --- Related Work --- p.8 / Chapter 2.1 --- Cache Performance --- p.8 / Chapter 2.2 --- Non-Blocking Cache --- p.9 / Chapter 2.3 --- Cache Prefetching --- p.10 / Chapter 2.3.1 --- Hardware Prefetching --- p.10 / Chapter 2.3.2 --- Software-assisted Prefetching --- p.13 / Chapter 2.3.3 --- Improving Cache Effectiveness --- p.22 / Chapter 2.4 --- Other Techniques to Reduce and Hide Memory Latencies --- p.25 / Chapter 2.4.1 --- Register Preloading --- p.25 / Chapter 2.4.2 --- Write Policies --- p.26 / Chapter 2.4.3 --- Small Specialized Cache --- p.26 / Chapter 2.4.4 --- Program Transformation --- p.27 / Chapter 3 --- Stride CAM Prefetching --- p.30 / Chapter 3.1 --- Introduction --- p.30 / Chapter 3.2 --- Architectural Model --- p.32 / Chapter 3.2.1 --- Compiler Support --- p.33 / Chapter 3.2.2 --- Hardware Support --- p.35 / Chapter 3.2.3 --- Model Details --- p.39 / Chapter 3.3 --- Optimization Issues --- p.39 / Chapter 3.3.1 --- Eliminating Reductant Prefetching --- p.40 / Chapter 3.3.2 --- Code Motion --- p.40 / Chapter 3.3.3 --- Burst Mode --- p.44 / Chapter 3.3.4 --- Stride CAM Overflow --- p.45 / Chapter 3.3.5 --- Effects of Loop Optimizations --- p.46 / Chapter 3.4 --- Practicability --- p.50 / Chapter 3.4.1 --- Evaluation Methodology --- p.51 / Chapter 3.4.2 --- Prefetch Accuracy --- p.54 / Chapter 3.4.3 --- Stride CAM Size --- p.56 / Chapter 3.4.4 --- Software Overhead --- p.60 / Chapter 4 --- Stride Register Prefetching --- p.67 / Chapter 4.1 --- Motivation --- p.67 / Chapter 4.2 --- Architectural Model --- p.67 / Chapter 4.2.1 --- Stride Register --- p.69 / Chapter 4.2.2 --- Compiler Support --- p.70 / Chapter 4.2.3 --- Prefetch Bits --- p.72 / Chapter 4.2.4 --- Operation Details --- p.77 / Chapter 4.3 --- Practicability and Optimizations --- p.78 / Chapter 4.3.1 --- Practicability on NASA7 Benchmark Programs --- p.78 / Chapter 4.3.2 --- Optimization Issues --- p.81 / Chapter 4.4 --- Comparison Between Stride CAM and Stride Register Models --- p.84 / Chapter 5 --- Small Software-Driven Array Cache --- p.87 / Chapter 5.1 --- Introduction --- p.87 / Chapter 5.2 --- Cache Pollution in MXM --- p.88 / Chapter 5.3 --- Architectural Model --- p.89 / Chapter 5.3.1 --- Operation Details --- p.91 / Chapter 5.4 --- Effectiveness of Array Cache --- p.92 / Chapter 6 --- Conclusion --- p.96 / Chapter 6.1 --- Conclusion --- p.96 / Chapter 6.2 --- Future Research: An Extension of the Stride CAM Model --- p.97 / Chapter 6.2.1 --- Background --- p.97 / Chapter 6.2.2 --- Reference Address Series --- p.98 / Chapter 6.2.3 --- Extending the Stride CAM Model --- p.100 / Chapter 6.2.4 --- Prefetch Overhead --- p.109 / Bibliography --- p.110 / Appendix --- p.114 / Chapter A --- Simulation Results - Stride CAM Model --- p.114 / Chapter A.l --- Execution Time --- p.114 / Chapter A.1.1 --- BTRIX --- p.114 / Chapter A.1.2 --- CFFT2D --- p.115 / Chapter A.1.3 --- CHOLSKY --- p.116 / Chapter A.1.4 --- EMIT --- p.117 / Chapter A.1.5 --- GMTRY --- p.118 / Chapter A.1.6 --- MXM --- p.119 / Chapter A.1.7 --- VPENTA --- p.120 / Chapter A.2 --- Memory Delay --- p.122 / Chapter A.2.1 --- BTRIX --- p.122 / Chapter A.2.2 --- CFFT2D --- p.123 / Chapter A.2.3 --- CHOLSKY --- p.124 / Chapter A.2.4 --- EMIT --- p.125 / Chapter A.2.5 --- GMTRY --- p.126 / Chapter A.2.6 --- MXM --- p.127 / Chapter A.2.7 --- VPENTA --- p.128 / Chapter A.3 --- Overhead --- p.129 / Chapter A.3.1 --- BTRIX --- p.129 / Chapter A.3.2 --- CFFT2D --- p.130 / Chapter A.3.3 --- CHOLSKY --- p.131 / Chapter A.3.4 --- EMIT --- p.132 / Chapter A.3.5 --- GMTRY --- p.133 / Chapter A.3.6 --- MXM --- p.134 / Chapter A.3.7 --- VPENTA --- p.135 / Chapter A.4 --- Hit Ratio --- p.136 / Chapter A.4.1 --- BTRIX --- p.136 / Chapter A.4.2 --- CFFT2D --- p.137 / Chapter A.4.3 --- CHOLSKY --- p.137 / Chapter A.4.4 --- EMIT --- p.138 / Chapter A.4.5 --- GMTRY --- p.139 / Chapter A.4.6 --- MXM --- p.139 / Chapter A.4.7 --- VPENTA --- p.140 / Chapter B --- Simulation Results - Array Cache --- p.141 / Chapter C --- NASA7 Benchmark --- p.145 / Chapter C.1 --- BTRIX --- p.145 / Chapter C.2 --- CFFT2D --- p.161 / Chapter C.2.1 --- cfft2dl --- p.161 / Chapter C.2.2 --- cfft2d2 --- p.169 / Chapter C.3 --- CHOLSKY --- p.179 / Chapter C.4 --- EMIT --- p.192 / Chapter C.5 --- GMTRY --- p.205 / Chapter C.6 --- MXM --- p.217 / Chapter C.7 --- VPENTA --- p.220
|
212 |
Data prefetching using hardware register value predictable table.January 1996 (has links)
by Chin-Ming, Cheung. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1996. / Includes bibliographical references (leaves 95-97). / Abstract --- p.i / Acknowledgement --- p.iii / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Overview --- p.1 / Chapter 1.2 --- Objective --- p.3 / Chapter 1.3 --- Organization of the dissertation --- p.4 / Chapter 2 --- Related Works --- p.6 / Chapter 2.1 --- Previous Cache Works --- p.6 / Chapter 2.2 --- Data Prefetching Techniques --- p.7 / Chapter 2.2.1 --- Hardware Vs Software Assisted --- p.7 / Chapter 2.2.2 --- Non-selective Vs Highly Selective --- p.8 / Chapter 2.2.3 --- Summary on Previous Data Prefetching Schemes --- p.12 / Chapter 3 --- Program Data Mapping --- p.13 / Chapter 3.1 --- Regular and Irregular Data Access --- p.13 / Chapter 3.2 --- Propagation of Data Access Regularity --- p.16 / Chapter 3.2.1 --- Data Access Regularity in High Level Program --- p.17 / Chapter 3.2.2 --- Data Access Regularity in Machine Code --- p.18 / Chapter 3.2.3 --- Data Access Regularity in Memory Address Sequence --- p.20 / Chapter 3.2.4 --- Implication --- p.21 / Chapter 4 --- Register Value Prediction Table (RVPT) --- p.22 / Chapter 4.1 --- Predictability of Register Values --- p.23 / Chapter 4.2 --- Register Value Prediction Table --- p.26 / Chapter 4.3 --- Control Scheme of RVPT --- p.29 / Chapter 4.3.1 --- Details of RVPT Mechanism --- p.29 / Chapter 4.3.2 --- Explanation of the Register Prediction Mechanism --- p.32 / Chapter 4.4 --- Examples of RVPT --- p.35 / Chapter 4.4.1 --- Linear Array Example --- p.35 / Chapter 4.4.2 --- Linked List Example --- p.36 / Chapter 5 --- Program Register Dependency --- p.39 / Chapter 5.1 --- Register Dependency --- p.40 / Chapter 5.2 --- Generalized Concept of Register --- p.44 / Chapter 5.2.1 --- Cyclic Dependent Register(CDR) --- p.44 / Chapter 5.2.2 --- Acyclic Dependent Register(ADR) --- p.46 / Chapter 5.3 --- Program Register Overview --- p.47 / Chapter 6 --- Generalized RVPT Model --- p.49 / Chapter 6.1 --- Level N RVPT Model --- p.49 / Chapter 6.1.1 --- Identification of Level N CDR --- p.51 / Chapter 6.1.2 --- Recording CDR instructions of Level N CDR --- p.53 / Chapter 6.1.3 --- Prediction of Level N CDR --- p.55 / Chapter 6.2 --- Level 2 Register Value Prediction Table --- p.55 / Chapter 6.2.1 --- Level 2 RVPT Structure --- p.56 / Chapter 6.2.2 --- Identification of Level 2 CDR --- p.58 / Chapter 6.2.3 --- Control Scheme of Level 2 RVPT --- p.59 / Chapter 6.2.4 --- Example of Index Array --- p.63 / Chapter 7 --- Performance Evaluation --- p.66 / Chapter 7.1 --- Evaluation Methodology --- p.66 / Chapter 7.1.1 --- Trace-Drive Simulation --- p.66 / Chapter 7.1.2 --- Architectural Method --- p.68 / Chapter 7.1.3 --- Benchmarks and Metrics --- p.70 / Chapter 7.2 --- General Result --- p.75 / Chapter 7.2.1 --- Constant Stride or Regular Data Access Applications --- p.77 / Chapter 7.2.2 --- Non-constant Stride or Irregular Data Access Applications --- p.79 / Chapter 7.3 --- Effect of Design Variations --- p.80 / Chapter 7.3.1 --- Effect of Cache Size --- p.81 / Chapter 7.3.2 --- Effect of Block Size --- p.83 / Chapter 7.3.3 --- Effect of Set Associativity --- p.86 / Chapter 7.4 --- Summary --- p.87 / Chapter 8 --- Conclusion and Future Research --- p.88 / Chapter 8.1 --- Conclusion --- p.88 / Chapter 8.2 --- Future Research --- p.90 / Bibliography --- p.95 / Appendix --- p.98 / Chapter A --- MCPI vs. cache size --- p.98 / Chapter B --- MCPI Reduction Percentage Vs cache size --- p.102 / Chapter C --- MCPI vs. block size --- p.106 / Chapter D --- MCPI Reduction Percentage Vs block size --- p.110 / Chapter E --- MCPI vs. set-associativity --- p.114 / Chapter F --- MCPI Reduction Percentage Vs set-associativity --- p.118
|
213 |
Replacement and placement policies for prefetched lines.January 1998 (has links)
by Sze Siu Ching. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1998. / Includes bibliographical references (leaves 119-122). / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Overlapping Computations with Memory Accesses --- p.3 / Chapter 1.2 --- Cache Line Replacement Policies --- p.4 / Chapter 1.3 --- The Rest of This Paper --- p.4 / Chapter 2 --- A Brief Review of IAP Scheme --- p.6 / Chapter 2.1 --- Embedded Hints for Next Data References --- p.6 / Chapter 2.2 --- Instruction Opcode and Addressing Mode Prefetching --- p.8 / Chapter 2.3 --- Chapter Summary --- p.9 / Chapter 3 --- Motivation --- p.11 / Chapter 3.1 --- Chapter Summary --- p.14 / Chapter 4 --- Related Work --- p.15 / Chapter 4.1 --- Existing Replacement Algorithms --- p.16 / Chapter 4.2 --- Placement Policies for Cache Lines --- p.18 / Chapter 4.3 --- Chapter Summary --- p.20 / Chapter 5 --- Replacement and Placement Policies of Prefetched Lines --- p.21 / Chapter 5.1 --- IZ Cache Line Replacement Policy in IAP scheme --- p.22 / Chapter 5.1.1 --- The Instant Zero Scheme --- p.23 / Chapter 5.2 --- Priority Pre-Updating and Victim Cache --- p.27 / Chapter 5.2.1 --- Priority Pre-Updating --- p.27 / Chapter 5.2.2 --- Priority Pre-Updating for Cache --- p.28 / Chapter 5.2.3 --- Victim Cache for Unreferenced Prefetch Lines --- p.28 / Chapter 5.3 --- Prefetch Cache for IAP Lines --- p.31 / Chapter 5.4 --- Chapter Summary --- p.33 / Chapter 6 --- Performance Evaluation --- p.34 / Chapter 6.1 --- Methodology and metrics --- p.34 / Chapter 6.1.1 --- Trace Driven Simulation --- p.35 / Chapter 6.1.2 --- Caching Models --- p.36 / Chapter 6.1.3 --- Simulation Models and Performance Metrics --- p.39 / Chapter 6.2 --- Simulation Results --- p.43 / Chapter 6.2.1 --- General Results --- p.44 / Chapter 6.3 --- Simulation Results of IZ Replacement Policy --- p.49 / Chapter 6.3.1 --- Analysis To IZ Cache Line Replacement Policy --- p.50 / Chapter 6.4 --- Simulation Results for Priority Pre-Updating with Victim Cache --- p.52 / Chapter 6.4.1 --- PPUVC in Cache with IAP Scheme --- p.52 / Chapter 6.4.2 --- PPUVC in prefetch-on-miss Cache --- p.54 / Chapter 6.5 --- Prefetch Cache --- p.57 / Chapter 6.6 --- Chapter Summary --- p.63 / Chapter 7 --- Architecture Without LOAD-AND-STORE Instructions --- p.64 / Chapter 8 --- Conclusion --- p.66 / Chapter A --- CPI Due to Cache Misses --- p.68 / Chapter A.1 --- Varying Cache Size --- p.68 / Chapter A.1.1 --- Instant Zero Replacement Policy --- p.68 / Chapter A.1.2 --- Priority Pre-Updating with Victim Cache --- p.70 / Chapter A.1.3 --- Prefetch Cache --- p.73 / Chapter A.2 --- Varying Cache Line Size --- p.75 / Chapter A.2.1 --- Instant Zero Replacement Policy --- p.75 / Chapter A.2.2 --- Priority Pre-Updating with Victim Cache --- p.77 / Chapter A.2.3 --- Prefetch Cache --- p.80 / Chapter A.3 --- Varying Cache Set Associative --- p.82 / Chapter A.3.1 --- Instant Zero Replacement Policy --- p.82 / Chapter A.3.2 --- Priority Pre-Updating with Victim Cache --- p.84 / Chapter A.3.3 --- Prefetch Cache --- p.87 / Chapter B --- Simulation Results of IZ Replacement Policy --- p.89 / Chapter B.1 --- Memory Delay Time Reduction --- p.89 / Chapter B.1.1 --- Varying Cache Size --- p.89 / Chapter B.1.2 --- Varying Cache Line Size --- p.91 / Chapter B.1.3 --- Varying Cache Set Associative --- p.93 / Chapter C --- Simulation Results of Priority Pre-Updating with Victim Cache --- p.95 / Chapter C.1 --- PPUVC in IAP Scheme --- p.95 / Chapter C.1.1 --- Memory Delay Time Reduction --- p.95 / Chapter C.2 --- PPUVC in Cache with Prefetch-On-Miss Only --- p.101 / Chapter C.2.1 --- Memory Delay Time Reduction --- p.101 / Chapter D --- Simulation Results of Prefetch Cache --- p.107 / Chapter D.1 --- Memory Delay Time Reduction --- p.107 / Chapter D.1.1 --- Varying Cache Size --- p.107 / Chapter D.1.2 --- Varying Cache Line Size --- p.109 / Chapter D.1.3 --- Varying Cache Set Associative --- p.111 / Chapter D.2 --- Results of the Three Replacement Policies --- p.113 / Chapter D.2.1 --- Varying Cache Size --- p.113 / Chapter D.2.2 --- Varying Cache Line Size --- p.115 / Chapter D.2.3 --- Varying Cache Set Associative --- p.117 / Bibliography --- p.119
|
214 |
Servicios de cache distribuidos para motores de búsqueda webGómez Pantoja, Carlos January 2014 (has links)
Doctor en Ciencias, Mención Computación / Los Motores de Búsqueda Web (WSEs) actuales están formados por cientos de nodos de procesamiento, los cuales están particionados en grupos llamados servicios. Cada servicio lleva a cabo una función específica, entre los que se destacan: (i) Servicio de Front-End; (ii) Servicio de Cache; y (iii) Servicio de Índice. Específicamente, el Servicio de Front-End maneja las consultas de usuario que arriban al WSE, las distribuye entre los otros servicios, espera por los resultados y genera la respuesta final al usuario. La idea clave del Servicio de Cache es reutilizar resultados previamente computados a consultas hechas en el pasado, lo cual reduce la utilización de recursos y las latencias asociadas. Finalmente, el Servicio de Índice utiliza un índice invertido para obtener de manera eficiente los identificadores de documentos que mejor responden la consulta. El presente trabajo de tesis se focaliza en el diseño e implementación de servicios de cache distribuidos eficientes.
Varios aspectos del sistema y el tráfico de consultas deben ser considerados en el diseño de servicios de cache eficientes: (i) distribuciones sesgadas de las consultas de usuario; (ii) nodos que entran y salen de los servicios (de una forma planificada o súbitamente); y (iii) la aparición de consultas en ráfaga. Cualquiera de estos tópicos es un problema importante, ya que (i) genera una asignación de carga desbalanceada entre los nodos; el tópico (ii) impacta en el servicio cuando no se utilizan mecanismos de balance de carga dinámicos, empeorando la asignación desbalanceada de carga y perdiendo información importante ante fallas; y finalmente (iii) puede congestionar o dejar fuera de servicio algunos nodos debido al abrupto incremento en el tráfico experimentado, incluso si se tiene un servicio balanceado. Dada la arquitectura que se emplea en este trabajo, el Servicio de Cache es el más expuesto a los problemas mencionados, poniendo en riesgo la tasa de hit de este servicio clave y el tiempo de respuesta del WSE.
Este trabajo ataca los problemas mencionados anteriormente proponiendo mejoras arquitecturales, tales como un enfoque de balance de carga dinámico para servicios de cache altamente acoplados (desplegados en clusters) basados en Consistent Hashing, y un esquema para monitoreo y distribución de consultas frecuentes. El mecanismo de balance de carga propuesto es una nueva solución al problema de balance de carga en clusters de computadores que corren aplicaciones manejadas por los datos (data-driven). Además, se estudia cómo predecir la aparición de consultas en ráfaga para tomar acciones correctivas antes de que saturen o colapsen algunos nodos. Finalmente, se adopta la idea de un sistema tolerante a fallas para proteger información valiosa obtenida a través del tiempo. La idea fundamental es replicar algunas entradas de cache entre distintos nodos para que sean usados en caso de fallas.
|
215 |
Unified on-chip multi-level cache management scheme using processor opcodes and addressing modes.January 1996 (has links)
by Stephen Siu-ming Wong. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1996. / Includes bibliographical references (leaves 164-170). / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Cache Memory --- p.2 / Chapter 1.2 --- System Performance --- p.3 / Chapter 1.3 --- Cache Performance --- p.3 / Chapter 1.4 --- Cache Prefetching --- p.5 / Chapter 1.5 --- Organization of Dissertation --- p.7 / Chapter 2 --- Related Work --- p.8 / Chapter 2.1 --- Memory Hierarchy --- p.8 / Chapter 2.2 --- Cache Memory Management --- p.10 / Chapter 2.2.1 --- Configuration --- p.10 / Chapter 2.2.2 --- Replacement Algorithms --- p.13 / Chapter 2.2.3 --- Write Back Policies --- p.15 / Chapter 2.2.4 --- Cache Miss Types --- p.16 / Chapter 2.2.5 --- Prefetching --- p.17 / Chapter 2.3 --- Locality --- p.18 / Chapter 2.3.1 --- Spatial vs. Temporal --- p.18 / Chapter 2.3.2 --- Instruction Cache vs. Data Cache --- p.20 / Chapter 2.4 --- Why Not a Large L1 Cache? --- p.26 / Chapter 2.4.1 --- Critical Time Path --- p.26 / Chapter 2.4.2 --- Hardware Cost --- p.27 / Chapter 2.5 --- Trend to have L2 Cache On Chip --- p.28 / Chapter 2.5.1 --- Examples --- p.29 / Chapter 2.5.2 --- Dedicated L2 Bus --- p.31 / Chapter 2.6 --- Hardware Prefetch Algorithms --- p.32 / Chapter 2.6.1 --- One Block Look-ahead --- p.33 / Chapter 2.6.2 --- Chen's RPT & similar algorithms --- p.34 / Chapter 2.7 --- Software Based Prefetch Algorithm --- p.38 / Chapter 2.7.1 --- Prefetch Instruction --- p.38 / Chapter 2.8 --- Hybrid Prefetch Algorithm --- p.40 / Chapter 2.8.1 --- Stride CAM Prefetching --- p.40 / Chapter 3 --- Simulator --- p.43 / Chapter 3.1 --- Multi-level Memory Hierarchy Simulator --- p.43 / Chapter 3.1.1 --- Multi-level Memory Support --- p.45 / Chapter 3.1.2 --- Non-blocking Cache --- p.45 / Chapter 3.1.3 --- Cycle-by-cycle Simulation --- p.47 / Chapter 3.1.4 --- Cache Prefetching Support --- p.47 / Chapter 4 --- Proposed Algorithms --- p.48 / Chapter 4.1 --- SIRPA --- p.48 / Chapter 4.1.1 --- Rationale --- p.48 / Chapter 4.1.2 --- Architecture Model --- p.50 / Chapter 4.2 --- Line Concept --- p.56 / Chapter 4.2.1 --- Rationale --- p.56 / Chapter 4.2.2 --- "Improvement Over ""Pure"" Algorithm" --- p.57 / Chapter 4.2.3 --- Architectural Model --- p.59 / Chapter 4.3 --- Combined L1-L2 Cache Management --- p.62 / Chapter 4.3.1 --- Rationale --- p.62 / Chapter 4.3.2 --- Feasibility --- p.63 / Chapter 4.4 --- Combine SIRPA with Default Prefetch --- p.66 / Chapter 4.4.1 --- Rationale --- p.67 / Chapter 4.4.2 --- Improvement Over “Pure´ح Algorithm --- p.69 / Chapter 4.4.3 --- Architectural Model --- p.70 / Chapter 5 --- Results --- p.73 / Chapter 5.1 --- Benchmarks Used --- p.73 / Chapter 5.1.1 --- SPEC92int and SPEC92fp --- p.75 / Chapter 5.2 --- Configurations Tested --- p.79 / Chapter 5.2.1 --- Prefetch Algorithms --- p.79 / Chapter 5.2.2 --- Cache Sizes --- p.80 / Chapter 5.2.3 --- Cache Block Sizes --- p.81 / Chapter 5.2.4 --- Cache Set Associativities --- p.81 / Chapter 5.2.5 --- "Bus Width, Speed and Other Parameters" --- p.81 / Chapter 5.3 --- Validity of Results --- p.83 / Chapter 5.3.1 --- Total Instructions and Cycles --- p.83 / Chapter 5.3.2 --- Total Reference to Caches --- p.84 / Chapter 5.4 --- Overall MCPI Comparison --- p.86 / Chapter 5.4.1 --- Cache Size Effect --- p.87 / Chapter 5.4.2 --- Cache Block Size Effect --- p.91 / Chapter 5.4.3 --- Set Associativity Effect --- p.101 / Chapter 5.4.4 --- Hardware Prefetch Algorithms --- p.108 / Chapter 5.4.5 --- Software Based Prefetch Algorithms --- p.119 / Chapter 5.5 --- L2 Cache & Main Memory MCPI Comparison --- p.127 / Chapter 5.5.1 --- Cache Size Effect --- p.130 / Chapter 5.5.2 --- Cache Block Size Effect --- p.130 / Chapter 5.5.3 --- Set Associativity Effect --- p.143 / Chapter 6 --- Conclusion --- p.154 / Chapter 7 --- Future Directions --- p.157 / Chapter 7.1 --- Prefetch Buffer --- p.157 / Chapter 7.2 --- Dissimilar L1-L2 Management --- p.158 / Chapter 7.3 --- Combined LRU/MRU Replacement Policy --- p.160 / Chapter 7.4 --- N Loops Look-ahead --- p.163
|
216 |
Improving on-chip data cache using instruction register information.January 1996 (has links)
by Lau Siu Chung. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1996. / Includes bibliographical references (leaves 71-74). / Abstract --- p.i / Acknowledgment --- p.ii / List of Figures --- p.v / Chapter Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Hiding memory latency --- p.1 / Chapter 1.2 --- Organization of dissertation --- p.4 / Chapter Chapter 2 --- Related Work --- p.5 / Chapter 2.1 --- Hardware controlled cache prefetching --- p.5 / Chapter 2.2 --- Software assisted cache prefetching --- p.9 / Chapter Chapter 3 --- Data Prefetching --- p.13 / Chapter 3.1 --- Data reference patterns --- p.14 / Chapter 3.2 --- Embedded hints for next data references --- p.19 / Chapter 3.3 --- Instruction Opcode and Addressing Mode Prefetching scheme --- p.21 / Chapter 3.3.1 --- Basic IAP scheme --- p.21 / Chapter 3.3.2 --- Enhanced IAP scheme --- p.24 / Chapter 3.3.3 --- Combined IAP scheme --- p.27 / Chapter 3.4 --- Summary --- p.29 / Chapter Chapter 4 --- Performance Evaluation --- p.31 / Chapter 4.1 --- Evaluation methodology --- p.31 / Chapter 4.1.1 --- Trace-driven simulation --- p.31 / Chapter 4.1.2 --- Caching models --- p.33 / Chapter 4.1.3 --- Benchmarks and metrics --- p.36 / Chapter 4.2 --- General Results --- p.41 / Chapter 4.2.1 --- Varying cache size --- p.44 / Chapter 4.2.2 --- Varying cache block size --- p.46 / Chapter 4.2.3 --- Varying associativity --- p.49 / Chapter 4.3 --- Other performance metrics --- p.52 / Chapter 4.3.1 --- Accuracy of prefetch --- p.52 / Chapter 4.3.2 --- Partial hit delay --- p.55 / Chapter 4.3.3 --- Bus usage problem --- p.59 / Chapter 4.4 --- Zero time prefetch --- p.63 / Chapter 4.5 --- Summary --- p.67 / Chapter Chapter 5 --- Conclusion --- p.68 / Chapter 5.1 --- Summary of our research --- p.68 / Chapter 5.2 --- Future work --- p.70 / Bibliography --- p.71
|
217 |
Techniques of distributed caching and terminal tracking for mobile computing.January 1997 (has links)
by Chiu-Fai Fong. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1997. / Includes bibliographical references (leaves 76-81). / Abstract --- p.i / Acknowledgments --- p.iii / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Distributed Data Caching --- p.2 / Chapter 1.2 --- Mobile Terminal Tracking --- p.5 / Chapter 1.3 --- Thesis Overview --- p.10 / Chapter 2 --- Personal Communication Network --- p.11 / Chapter 2.1 --- Network Architecture --- p.11 / Chapter 2.2 --- Resource Limitations --- p.13 / Chapter 2.3 --- Mobility --- p.14 / Chapter 3 --- Distributed Data Caching --- p.17 / Chapter 3.1 --- System Model --- p.18 / Chapter 3.1.1 --- The Wireless Network Environment --- p.18 / Chapter 3.1.2 --- Caching Protocol --- p.19 / Chapter 3.2 --- Caching at Mobile Computers --- p.22 / Chapter 3.3 --- Broadcasting at the Server --- p.24 / Chapter 3.3.1 --- Passive Strategy --- p.27 / Chapter 3.3.2 --- Active Strategy --- p.27 / Chapter 3.4 --- Performance Analysis --- p.29 / Chapter 3.4.1 --- Bandwidth Requirements --- p.29 / Chapter 3.4.2 --- Lower Bound on the Optimal Bandwidth Consumption --- p.30 / Chapter 3.4.3 --- The Read Response Time --- p.32 / Chapter 3.5 --- Experiments --- p.35 / Chapter 3.6 --- Mobility Concerns --- p.42 / Chapter 4 --- Mobile Terminal Tracking --- p.44 / Chapter 4.1 --- Movement Model --- p.45 / Chapter 4.2 --- Optimal Paging --- p.48 / Chapter 4.3 --- Transient Analysis --- p.52 / Chapter 4.3.1 --- The Time-Based Protocol --- p.55 / Chapter 4.3.2 --- Distance-Based Protocol --- p.59 / Chapter 4.4 --- The Reverse-Guessing Protocol --- p.64 / Chapter 4.5 --- Experiments --- p.66 / Chapter 5 --- Conclusions & Future Work --- p.71 / Chapter 5.1 --- Distributed Data Caching --- p.72 / Chapter 5.2 --- Mobile Terminal Tracking --- p.73 / Bibliography --- p.76 / A Proof of NP-hardness of the Broadcast Set Assignment Problem --- p.82
|
218 |
O efeito da largura de Fetch no desempenho das arquiteturas super escalar, trace cache e DTSVLIWFreitas, Christian Daros de 29 October 2003 (has links)
Made available in DSpace on 2016-12-23T14:33:33Z (GMT). No. of bitstreams: 1
dissertacao.pdf: 525748 bytes, checksum: d81fee4d754843c091457bdd3b0ce230 (MD5)
Previous issue date: 2003-10-29 / Superscalar machines fetch multiple scalar instructions per cycle from the instruction cache. However, machines that fetch no more than one instruction per cycle from
the instruction cache, such as Dynamically Trace Scheduled VLIW (DTSVLIW) machines, have shown performance comparable to that of Superscalars. In this paper we present experiments which show that fetching a single instruction from the instruction cache per cycle allows the same performance achieved fetching multiple instructions per cycle thanks to the execution locality present in programs. We also
present the first direct comparison between the Superscalars, Trace Cache and DTSVLIW architectures. Our results show that a DTSVLIW machine capable of executing up to 16 instructions per cycle can perform 21.9% better than a
Superscalar and 6.6% better than a Trace Cache with equivalent hardware. In the comparison between a DTSVLIW machine and an Alpha 21264 machine, we have shown that the DTSVLIW can perform 24,17% better than Alpha using integer programs, and 60,36% better than Alpha using floating point programs. / Máquinas Super Escalares trazem múltiplas instruções escalares da cache de instruções por ciclo. Contudo, máquinas que buscam na cache de instruções apenas
uma instrução escalar por ciclo de relógio têm demonstrado níveis de desempenho comparáveis aos de máquinas Super Escalares, como é o caso de máquinas que seguem a arquitetura Dynamically Trace Scheduled VLIW (DTSVLIW). Neste trabalho, é mostrado através de experimentos que basta trazer uma instrução escalar por ciclo de máquina da cache de instruções para atingir praticamente o
mesmo desempenho obtido trazendo várias instruções por ciclo graças à localidade de execução existente nos programas. Fazemos, também, a primeira comparação
direta entre as arquiteturas Super Escalar, Trace Cache e DTSVLIW. Os resultados dos experimentos mostram que uma máquina DTSVLIW, capaz de executar até 16 instruções por ciclo, tem desempenho 21.9% superior que uma Super Escalar
hipotética e 6.6% superior que uma Trace Cache com hardware equivalente. Quando comparada com uma máquina Alpha 21264, a máquina DTSVLIW apresenta um desempenho 24,17% superior, para os programas inteiros e, 60,36%
superior, para os programas de ponto flutuante do SPEC2000.
|
219 |
Dynamicky zasílané www-stránky / Server driven negotiationMikulka, Pavel January 2006 (has links)
Práce přibližuje základy protokolu HTTP a možnosti využití dynamického zasílání www stránek. První kapitola popisuje protokol HTTP na obecné úrovni, druhá se věnuje dynamicky zasílaným stránkám. Přínosem je ukázka implementace na dvou prakticky využitelných aplikacích. První z nich je automatický rozcestník pro webové sídlo, jehož úkolem je přesměrovat uživatele na nejvhodnější jazykovou verzi v závislosti na hodnotě hlavičky Accept-Language nebo IP adrese a druhou je download platforma pro společnost nabízející zábavní obsah pro mobilní telefony, jež poskytuje uživateli optimální verzi obsahu v závislosti na user-agent hlavičce jeho přístroje.
|
220 |
Vliv geocachingu na cestovní ruch / The influence of geocaching on tourismZábranská, Vendula January 2011 (has links)
This diploma thesis focuses on the influence of geocaching on tourism. First it briefly presents the principle of precise determination of position and coordinate systems, and then it describes the rules and principles of the game. It deals with the influence of geocaching on domestic tourism -- describing current situation, motivation of geocachers and functions of geocaching. It also analyzes the international influence of geocaching based on the comparison of visits of concrete caches by domestic and foreign geocachers. Then it also compares the numbers of caches in the Czech Republic which provide listing in foreign language and so these caches are foreigners friendly. Last chapter focuses on the analysis of several projects which use geocaching as a marketing instrument and tries to provide recommendations and identify mistakes which should be avoided in order to implement a successful project.
|
Page generated in 0.0354 seconds