21

Performance impact of programmer-inserted Data Prefetches for irregular access patterns with a case study of FMM VList algorithm

Tondon, Abhishek 22 April 2014
Data prefetching is a well-known technique for speeding up applications: hardware prefetchers or compilers speculatively prefetch data into caches closer to the processor so that it is readily available when the processor demands it. Since incorrect speculation leads to prefetching useless data, which in turn wastes memory bandwidth and pollutes caches, prefetch mechanisms are usually conservative and prefetch only on spotting fairly regular access patterns. This gives the programmer, who knows the application, an opportunity to insert fine-grain software prefetches in the code to precisely prefetch data that is certain to be demanded but whose access pattern is too obscure for hardware prefetchers or the compiler to detect. In this study, the author demonstrates the performance improvement obtained by such programmer-inserted prefetches through a case study of an FMM (Fast Multipole Method) VList application kernel run with several different configurations. The VList computation requires computing the Hadamard product of matrices. However, the way each node of the octree is stored in memory leads to indirect accessing of elements: the memory accesses themselves are not sequential, but the pointers to those memory locations are stored sequentially. Since compilers do not insert prefetches for indirect accesses, and to hardware the access pattern appears random, programmer-inserted prefetching is the only solution in such a case. The author demonstrates the performance gain obtained with different prefetching choices, in terms of which structures in the code to prefetch and to which level of cache, and presents an analysis of the impact of different configuration parameters on the gain. The author shows that several prefetching combinations always bring performance gain without ever hurting performance, and identifies prefetching all the data structures in question into the L1 cache as the best recommendation for this application kernel: this one combination achieves the highest performance gain for most run configurations and an average gain of 10.14% across all of them.
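A minimal sketch of the kind of programmer-inserted prefetch the abstract describes: the pointers to node data are stored sequentially, so a loop can issue `__builtin_prefetch` (a GCC/Clang builtin) on the targets a few iterations ahead of their indirect use. The data layout, prefetch distance, and all names here are illustrative assumptions, not the thesis's actual code; locality hint 3 requests high temporal locality, which on x86 typically prefetches into L1, matching the recommendation above.

```c
#include <stddef.h>

#define PF_DIST 8   /* prefetch distance in iterations (a tuning knob) */

typedef struct { double re, im; } cplx;

/* Hadamard (element-wise) product accumulated through indirection: the
 * pointer arrays dst, a, b are walked sequentially, but the blocks they
 * point to are scattered, so hardware sees a seemingly random pattern. */
void hadamard_accum(cplx *const *dst, const cplx *const *a,
                    const cplx *const *b, size_t n, size_t len)
{
    for (size_t i = 0; i < n; i++) {
        if (i + PF_DIST < n) {
            /* Prefetch the blocks we will touch PF_DIST iterations from
             * now; hint 3 = high temporal locality (typically L1). */
            __builtin_prefetch(a[i + PF_DIST], 0, 3);
            __builtin_prefetch(b[i + PF_DIST], 0, 3);
            __builtin_prefetch(dst[i + PF_DIST], 1, 3);
        }
        for (size_t j = 0; j < len; j++) {
            dst[i][j].re += a[i][j].re * b[i][j].re - a[i][j].im * b[i][j].im;
            dst[i][j].im += a[i][j].re * b[i][j].im + a[i][j].im * b[i][j].re;
        }
    }
}
```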
22

Proxy-based prefetching and pushing of web resources / Proxy-baserad prefetching och pushing av web resurser

Holm, Jacob January 2016
Use of the WWW is more prevalent now than ever, and latency has a significant impact on it: higher latency means longer webpage loading times, so lowering latency lowers loading time. Latency is often caused by data traveling long distances or passing through gateways that add processing delays to forwarded packets. In this thesis we evaluate the latency benefits of different algorithms for prefetching and pushing web resources from a proxy when the client cache is known. We found the most beneficial algorithm to be a two-sequence data-mining technique. This algorithm was evaluated on a live system, where it improved loading time by approximately 246 ms with only a 27% traffic increase on average. The results were measured over a large set of clients on Opera Turbo 2, a distributed proxy with knowledge of the client's cache. We also conclude that, using a more conservative strategy, prefetched resources can be pushed to the client, reducing client requests by approximately 9.3% without any significant traffic increase between proxy and client.
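The abstract does not spell out the two-sequence data-mining algorithm; the sketch below shows one plausible reading under stated assumptions: count how often resource b immediately follows resource a across observed request streams, and push b only when both support and confidence clear thresholds, mirroring the conservative push strategy. The URL-to-id mapping and the thresholds are hypothetical.

```c
/* Pairwise ("two-sequence") successor counts over request streams.
 * URLs are assumed pre-mapped to small integer ids in [0, NRES). */
#define NRES 256

static unsigned follows[NRES][NRES]; /* follows[a][b]: b right after a */
static unsigned seen[NRES];          /* total observations of a        */

void observe(int a, int b) { follows[a][b]++; seen[a]++; }

/* Conservative push decision: demand minimum support (a seen often
 * enough) and minimum confidence (b usually follows a), so pushed
 * resources are rarely discarded by the client. */
int should_push(int a, int b, unsigned min_support, double min_conf)
{
    if (seen[a] < min_support) return 0;
    return (double)follows[a][b] / seen[a] >= min_conf;
}
```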
23

A Power Conservation Methodology for Hard Drives by Combining Prefetching Algorithms and Flash Memory

Halper, Raymond 01 January 2013
Computing system power consumption is a concern with both financial and environmental implications, and these concerns will grow given current trends in data growth, information availability requirements, and the rising cost of energy. Data growth compounds daily because of the accessibility of portable devices, increased connectivity to the Internet, and the trend toward storing information electronically. These three factors also increase demand for data to be available at all times, which means more electronic devices requiring power. As more electricity is required, the overall cost of energy increases due to demand and limited resource availability. The environment also suffers, since most electricity is generated from fossil fuels, increasing emissions of carbon dioxide into the atmosphere. To reduce the energy required while maintaining data availability, researchers have focused on changing how data is accessed from hard drives, which have been found to consume 10 to 86 percent of a system's energy. By implementing multi-speed hard drives, algorithms that prefetch, cache, and batch data requests, or flash-drive caches, researchers have been able to reduce the energy consumed by hard drive operation. However, these approaches often reduce I/O performance or data availability. This dissertation provides a new method of reducing hard drive energy consumption: a prefetching technique that predicts a chain of future requests from previous observations. The files to be prefetched are handed to a caching system that uses a flash memory device and applies energy-sensitive algorithms to optimize the value of the files it stores. Because files are prefetched, the system's hard drive can be placed in a low-power sleep state, reducing power consumption while preserving high I/O performance and data availability. Analysis of simulator results confirmed that the new method improves I/O performance and data availability over previous studies while also providing greater energy savings: out of 30 scenarios, it showed better energy savings in 26 and better performance in all 30, and for one workload it achieved its results in 50.9 percent less time with 34.6 percent less energy than previous methodologies.
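A hedged sketch of how an "energy-sensitive" flash-cache valuation might look; the dissertation's actual algorithms are not given in the abstract. Here a cached file's value is the disk wake-up energy its predicted hits avoid, per megabyte of flash it occupies, and the lowest-density files are evicted first. All fields and the formula are illustrative assumptions.

```c
/* Illustrative value density for a file held in the flash cache:
 * energy saved per MB of flash consumed. Evicting the lowest-density
 * file first keeps the cache biased toward files whose hits let the
 * hard drive stay asleep longest. */
typedef struct {
    double hits_per_hour;   /* predicted access frequency */
    double size_mb;         /* flash space occupied       */
} cached_file;

double energy_value(const cached_file *f, double joules_per_disk_wakeup)
{
    return f->hits_per_hour * joules_per_disk_wakeup / f->size_mb;
}
```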
24

  • Uso de técnicas e informações em algoritmos adaptativos para substituição de páginas. / Use of techniques and information in adaptive page replacement algorithms.

Silva, Ricardo Leandro Piantola da 19 March 2010
O desempenho do sistema de memória virtual depende diretamente da qualidade da política de gerência de memória. Estratégias podem ser desenvolvidas para melhorar tal desempenho: uma delas é criar novas políticas de gerência de memória que tenham, ao mesmo tempo, bom desempenho e simplicidade; outra maneira é desenvolver técnicas e incluir informações para auxiliar as políticas já existentes. Este trabalho procura mostrar uma estratégia para auxiliar políticas de substituição com a finalidade de obter bom desempenho em um sistema de gerência de memória, sem a necessidade de alterar o comportamento da política de substituição. Para isso, foi utilizada a técnica de busca antecipada de páginas em conjunto com a informação de frequência de acessos, obtida por meio de um método usado em processamento estatístico de linguagem natural. Os resultados mostram, além do bom desempenho, que a mesma estratégia pode ser adotada em qualquer algoritmo. / The performance of the virtual memory system depends directly on the quality of the memory management policy. Strategies can be developed to improve this performance: one is to create new memory management policies that combine good performance with simplicity; another is to develop techniques and include information that aid existing policies. This work presents a strategy that aids replacement policies in order to obtain good performance from a memory management system without changing the replacement policy's behavior. To do so, a page prefetching technique was used together with access-frequency information obtained through a method used in statistical natural language processing. The results show, besides the good performance, that the same strategy can be adopted in any algorithm.
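The abstract names only "a method used in statistical natural language processing" for obtaining access frequencies; a successor-frequency (bigram-style) model is one plausible reading. The sketch below tracks, per page, the successor seen most often (a majority-vote counter) and exposes a prediction the prefetcher can act on without touching the replacement policy itself, matching the thesis's goal of aiding rather than altering it. Everything here is an illustrative assumption.

```c
#include <stddef.h>

#define NPAGES 4096   /* toy bound; a real table would be hashed */

/* For each page p, remember the successor seen most often so far,
 * using a majority-vote counter that decays when rivals appear. */
static struct { size_t succ; unsigned count; } best[NPAGES];

void observe_transition(size_t p, size_t q)
{
    if (best[p].succ == q)   best[p].count++;
    else if (best[p].count)  best[p].count--;   /* decay rival successor */
    else { best[p].succ = q; best[p].count = 1; }
}

/* Returns 1 and writes the predicted next page when confidence is high
 * enough to justify a prefetch; returns 0 to skip prefetching. */
int predict_next(size_t p, size_t *out, unsigned min_count)
{
    if (best[p].count < min_count) return 0;
    *out = best[p].succ;
    return 1;
}
```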
26

Data Prefetching via Off-line Learning

Wong, Weng Fai 01 1900
The widely acknowledged performance gap between processors and memory has been the subject of much research. In the Explicitly Parallel Instruction Computing (EPIC) paradigm, the combination of in-order issue and a large number of parallel function units has further worsened the problem. Prefetching, by hardware, software, or a combination of both, has been one of the primary mechanisms to alleviate it. In this talk, we will discuss two prefetching mechanisms suitable for implementation in EPIC processors, one in hardware and the other in software. Both methods rely on off-line learning of Markovian predictors. In the hardware mechanism, the predictors are loaded into a table used by a prefetch engine; we have shown that this method is particularly effective for prefetching into the L2 cache. Our software mechanism, which we call predicated prefetch, leverages informing loads, used in conjunction with data remapping and off-line learning of Markovian predictors. This distinguishes our approach from earlier software prefetching techniques that involve only static program analysis. Our experiments show that this framework, together with the algorithms used in it, can remove, in the best instance, 30% of the stall cycles due to cache misses. The results also show that the framework performs better than pure hardware stride predictors and has lower bandwidth and instruction overheads than pure software approaches. / Singapore-MIT Alliance (SMA)
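A small sketch of the hardware-style mechanism described above: a table of Markovian (last miss, likely next miss) pairs learned off-line is loaded at startup, and on each miss the engine prefetches the predicted successor. The table size, hash, and locality hint are assumptions; the talk's actual engine is not specified at this level of detail.

```c
#include <stdint.h>
#include <stddef.h>

#define ENTRIES 512

/* One learned (miss -> next miss) correlation per slot; the table
 * contents come from an off-line profiling pass loaded before the run. */
typedef struct { uintptr_t miss, next; } markov_entry;

static markov_entry table[ENTRIES];

static size_t hash(uintptr_t addr) { return (addr >> 6) % ENTRIES; }

/* Called on each cache miss: if the miss address matches a learned
 * entry, prefetch its successor. Locality hint 2 maps on x86 to
 * prefetcht1 (roughly L2 and outward), loosely matching the finding
 * that this scheme is most effective for L2 prefetching. */
void on_miss(uintptr_t addr)
{
    markov_entry *e = &table[hash(addr)];
    if (e->miss == addr && e->next)
        __builtin_prefetch((const void *)e->next, 0, 2);
}
```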
27

Visibility Based Prefetching With Simulated Annealing

Cevikbas, Safak Burak 01 February 2008
Complex urban scene rendering is not feasible without culling invisible geometry before rendering actually takes place. Visibility culling can be performed on predefined regions of the scene, with a potentially visible set (PVS) of scene geometry computed for each region. Rendering cost is reduced because only the single PVS associated with the viewer's region is rendered, instead of a larger set. However, when the viewer leaves a region and enters one of its neighbors, disposing of the currently loaded PVS and loading the new one causes stalls. Prefetching policies overcome these stalls by loading a region's PVS before the viewer enters it. This study presents a prefetching method for interactive urban walkthroughs. Regions and the transitions among them are represented as a graph in which regions are nodes and transitions are edges. Groups of nodes are formed according to statistical data on transitions and used as the prefetching policy. Heuristics for constructing the groups are developed, and Simulated Annealing is used to construct optimized groups based on those heuristics. The proposed method and the underlying application of Simulated Annealing are customized to minimize average transition cost.
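Simulated Annealing itself is standard, so a generic skeleton of the optimization loop the thesis applies to node grouping is sketched below; the cost function (e.g., average transition cost of a region-to-group assignment) and the perturbation move (e.g., moving one region to another group) are passed in as callbacks, since their exact definitions are not given in the abstract.

```c
#include <stdlib.h>
#include <string.h>
#include <math.h>

/* Classic simulated-annealing loop: try a perturbed candidate each
 * iteration, always accept improvements, and accept regressions with
 * the Boltzmann probability exp(-(c - cur) / temp) as temp cools. */
void anneal(int *group, int n,
            double (*cost)(const int *, int),
            void (*perturb)(int *, int),
            double temp, double cooling, int iters)
{
    int *cand = malloc((size_t)n * sizeof *cand);
    double cur = cost(group, n);
    for (int i = 0; cand && i < iters; i++, temp *= cooling) {
        memcpy(cand, group, (size_t)n * sizeof *cand);
        perturb(cand, n);
        double c = cost(cand, n);
        if (c < cur || exp(-(c - cur) / temp) > (double)rand() / RAND_MAX) {
            memcpy(group, cand, (size_t)n * sizeof *group);
            cur = c;
        }
    }
    free(cand);
}
```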
28

Accelerating Markov chain Monte Carlo via parallel predictive prefetching

Angelino, Elaine Lee 21 October 2014
We present a general framework for accelerating a large class of widely used Markov chain Monte Carlo (MCMC) algorithms. This dissertation demonstrates that MCMC inference can be accelerated in a model of parallel computation that uses speculation to predict and complete computational work ahead of when it is known to be useful. By exploiting fast, iterative approximations to the target density, we can speculatively evaluate many potential future steps of the chain in parallel. In Bayesian inference problems, this approach can accelerate sampling from the target distribution, without compromising exactness, by exploiting subsets of data. It takes advantage of whatever parallel resources are available, but produces results exactly equivalent to standard serial execution. In the initial burn-in phase of chain evaluation, it achieves speedup over serial evaluation that is close to linear in the number of available cores. / Engineering and Applied Sciences
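One core piece of predictive prefetching, sketched under assumptions: the uniform draw for each pending Metropolis-Hastings step is fixed in advance, so a cheap approximate density can predict which successor state (accept or reject) speculative workers should start on, while exactness is preserved because the final decision always uses the exact density. The function shape is illustrative, not the dissertation's API, and a symmetric proposal is assumed.

```c
/* log_u is log(u) for the step's pre-drawn uniform u, fixed before any
 * speculation so the realized chain is identical to serial execution. */
typedef double (*logdens_fn)(const double *x, int dim);

/* Predict whether a pending Metropolis-Hastings step (symmetric
 * proposal) will accept, using a cheap approximate log density built
 * from a data subset; workers then speculatively begin computing the
 * predicted successor state while the exact density is evaluated. */
int predict_accept(logdens_fn approx_logdens, const double *cur,
                   const double *prop, int dim, double log_u)
{
    return approx_logdens(prop, dim) - approx_logdens(cur, dim) > log_u;
}
```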
29

DRAM-aware prefetching and cache management

Lee, Chang Joo, 1975- 11 February 2011
Main memory system performance is crucial for high-performance microprocessors. Even though the peak bandwidth of main memory systems has increased through improvements in the microarchitecture of Dynamic Random Access Memory (DRAM) chips, conventional on-chip memory systems do not fully take advantage of it. This results in underutilization of the DRAM system: in other words, many idle cycles on the DRAM data bus. The main reason is that conventional on-chip memory system designs do not fully take important DRAM characteristics into account, so the high bandwidth of DRAM-based main memory cannot be realized and exploited by the processor. This dissertation identifies three major performance-related characteristics that can significantly affect DRAM performance and makes a case for DRAM characteristic-aware on-chip memory system design. We show that on-chip memory resource management policies (such as prefetching, buffer, and cache policies) that are aware of these characteristics can significantly enhance entire-system performance. The key idea of the proposed mechanisms is to send the DRAM system useful memory requests that can be serviced with low latency or in parallel with other requests, rather than requests that are serviced with high latency or serially. Our evaluations demonstrate that each of the proposed DRAM-aware mechanisms significantly improves performance by increasing DRAM utilization for useful data. We also show that when employed together, the mechanisms' benefits combine additively: they work synergistically and significantly improve the overall system performance of both single-core and Chip MultiProcessor (CMP) systems.
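A toy illustration of the "send low-latency requests first" idea: among pending requests, a DRAM-aware scheduler prefers one that hits a bank's currently open row (served at column-access latency) over requests needing a precharge and activate. The structures and policy below are illustrative assumptions, not the dissertation's mechanism.

```c
#include <stdint.h>
#include <stddef.h>

#define NBANKS 8

/* A pending memory request targeting a DRAM bank and row. */
typedef struct { int bank; uint32_t row; } mem_req;

static uint32_t open_row[NBANKS];  /* row latched in each bank's buffer */

static int row_hit(const mem_req *r) { return open_row[r->bank] == r->row; }

/* DRAM-aware pick: issue a row-buffer hit if one is pending (it is
 * serviced at low latency, keeping the data bus busy with useful
 * transfers); otherwise fall back to the oldest request, index 0. */
size_t pick_next(const mem_req *queue, size_t pending)
{
    for (size_t i = 0; i < pending; i++)
        if (row_hit(&queue[i])) return i;
    return 0;
}
```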
30

Ultra-mobile computing: adapting network protocol and algorithms for smartphones and tablets

Sanadhya, Shruti 12 January 2015
Smartphones and tablets have been growing in popularity. These ultra-mobile devices bring new challenges for efficient network operation because of their mobility, resource constraints, and richness of features. There is thus an increasing need to adapt network protocols to these devices and to the traffic demands on wireless service providers. This dissertation focuses on identifying design limitations in existing network protocols operating in ultra-mobile environments and on developing algorithmic solutions for them. Our work comprises three components. The first identifies the shortcomings of TCP's flow control algorithm when operating on resource-constrained smartphones and tablets. We propose an Adaptive Flow Control (AFC) algorithm for TCP that relies not just on the available buffer space but also on the application read rate at the receiver. The second component looks at network deduplication for mobile devices. With traditional network deduplication (dedup), the dedup source uses only the portion of the cache at the dedup destination that it is aware of. We argue that in a mobile environment the dedup destination (say, the mobile device) may have accumulated a much larger cache than the current dedup source is aware of. In this context, we propose Asymmetric Caching, which allows the dedup destination to selectively feed appropriate portions of its cache back to the dedup source, improving redundancy-elimination efficiency. The third and final component leverages network heterogeneity for prefetching on mobile devices. Our analysis of the browser history of 24 iPhone users shows that URLs do not repeat exactly: users show considerable repetition in the domains they visit, but not in the particular URLs. Additionally, mobile users access web content over diverse network technologies: WiFi and cellular (3G/4G). While data is unlimited over WiFi, users typically have monthly limits on cellular data. In this context, we propose Precog, an action-based prefetching solution that reduces the cellular data footprint of smartphones and tablets.
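A minimal sketch of the Adaptive Flow Control idea as the abstract states it, under assumptions: the advertised receive window is capped not only by free buffer space but also by what the application can read in roughly one RTT. The parameter names and the exact formula are hypothetical, not from the dissertation.

```c
/* Advertised window = min(free buffer, application read rate x RTT).
 * With a slow reader the rate term dominates and shrinks the window,
 * so the sender stops flooding a buffer the app cannot drain. */
unsigned long afc_window(unsigned long free_buf_bytes,
                         double app_read_bytes_per_sec,
                         double rtt_sec)
{
    double by_rate = app_read_bytes_per_sec * rtt_sec;
    if ((double)free_buf_bytes < by_rate)
        return free_buf_bytes;
    return (unsigned long)by_rate;
}
```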
