Global ETD Search

41	Efficient graph algorithm execution on data-parallel architectures Bangalore Lakshminarayana, Nagesh 12 January 2015 (has links) Mechanisms for improving the execution efficiency of graph algorithms on Data-Parallel Architectures were proposed and identified. Execution of graph algorithms on GPGPU architectures, the prevalent data-parallel architectures was considered. Irregular and data dependent accesses in graph algorithms were found to cause significant idle cycles in GPGPU cores. A prefetching mechanism that reduced the amount of idle cycles by prefetching a data-dependent access pattern found in graph algorithms was proposed. Storing prefetches in unused spare registers in addition to storing them in the cache was shown to be more effective by the prefetching mechanism. The design of the cache hierarchy for graph algorithms was explored. First, an exclusive cache hierarchy was shown to be beneficial at the cost of increased traffic; a region based exclusive cache hierarchy was shown to be similar in performance to an exclusive cache hierarchy while reducing on-chip traffic. Second, bypassing cache blocks at both the level one and level two caches was shown to be beneficial. Third, the use of fine-grained memory accesses (or cache sub-blocking) was shown to be beneficial. The combination of cache bypassing and fine-grained memory accesses was shown to be more beneficial than applying the two mechanisms individually. Finally, the impact of different implementation strategies on algorithm performance was evaluated for the breadth first search algorithm using different input graphs and heuristics to identify the best performing implementation for a given input graph were also discussed. Graph algorithms Data-parallel architectures GPGPU architectures Prefetching Cache hierarchy Inclusion property Cache bypass Fine-grained accesses BFS characterization
42	Word Space Models for Web User Clustering and Page Prefetching Sundin, Albin January 2012 (has links) This study evaluates methods for clustering web users via vector space models, for the purpose of web page prefetching for possible applications of server optimization. An experiment using Latent Semantic Analysis (LSA) is deployed to investigate whether LSA can reproduce the encouraging results obtained from previous research with Random Indexing (RI) and a chaos based optimization algorithm (CAS-C). This is not only motivated by LSA being yet another vector space model, but also by a study indicating LSA to outperform RI in a task similar to the web user clustering and prefetching task. The prefetching task was used to verify the applicability of LSA, where both RI and CAS-C have shown promising results. The original data set from the RI web user clustering and prefetching task was modeled using weighted (tf-idf) LSA. Clusters were defined using a common clustering algorithm (k-means). The least scattered cluster configuration for the model was identified by combining an internal validity measure (SSE) and a relative criterion validity measure (SD index). The assumed optimal cluster configuration was used for the web page prefetching task. Precision and recall of the LSA based method is found to be on par with RI and CAS-C, in as much that it solves the web user clustering and web task with similar characteristics as unweighted RI. The hypothesized inherent gains to precision and recall by using LSA was neither confirmed nor conclusively disproved. The effects of different weighting functions for RI are discussed and a number of methodological factors are identified for further research concerning LSA based clustering and prefetching. LSA RI CAS-C Clustering Prefetching Web mining Segmentering Computer Sciences Datavetenskap (datalogi)
43	Adaptive Prefetching and Cache Partitioning for Multicore Processors Selfa Oliver, Vicent 13 November 2018 (has links) El acceso a la memoria principal en los procesadores actuales supone un importante cuello de botella para las prestaciones, dado que los diferentes núcleos compiten por el limitado ancho de banda de memoria, agravando la brecha entre las prestaciones del procesador y las de la memoria principal. Distintas técnicas atacan este problema, siendo las más relevantes el uso de jerarquías de caché multinivel y la prebúsqueda. Las cachés jerárquicas aprovechan la localidad temporal y espacial que en general presentan los programas en el acceso a los datos, para mitigar las enormes latencias de acceso a memoria principal. Para limitar el número de accesos a la memoria DRAM, fuera del chip, los procesadores actuales cuentan con grandes cachés de último nivel (LLC). Para mejorar su utilización y reducir costes, estas cachés suelen compartirse entre todos los núcleos del procesador. Este enfoque mejora significativamente el rendimiento de la mayoría de las aplicaciones en comparación con el uso de cachés privados más pequeños. Compartir la caché, sin embargo, presenta una problema importante: la interferencia entre aplicaciones. La prebúsqueda, por otro lado, trae bloques de datos a las cachés antes de que el procesador los solicite, ocultando la latencia de memoria principal. Desafortunadamente, dado que la prebúsqueda es una técnica especulativa, si no tiene éxito puede contaminar la caché con bloques que no se usarán. Además, las prebúsquedas interfieren con los accesos a memoria normales, tanto los del núcleo que emite las prebúsquedas como los de los demás. Esta tesis se centra en reducir la interferencia entre aplicaciones, tanto en las caché compartidas como en el acceso a la memoria principal. Para reducir la interferencia entre aplicaciones en el acceso a la memoria principal, el mecanismo propuesto en esta disertación regula la agresividad de cada prebuscador, activando o desactivando selectivamente algunos de ellos, dependiendo de su rendimiento individual y de los requisitos de ancho de banda de memoria principal de los otros núcleos. Con respecto a la interferencia en cachés compartidos, esta tesis propone dos técnicas de particionado para la LLC, las cuales otorgan más espacio de caché a las aplicaciones que progresan más lentamente debido a la interferencia entre aplicaciones. La primera propuesta de particionado de caché requiere hardware específico no disponible en procesadores comerciales, por lo que se ha evaluado utilizando un entorno de simulación. La segunda propuesta de particionado de caché presenta una familia de políticas que superan las limitaciones en el número de particiones y en el número de vías de caché disponibles mediante la agrupación de aplicaciones en clústeres y la superposición de particiones de caché, por lo que varias aplicaciones comparten las mismas vías. Dado que se ha implementado utilizando los mecanismos para el particionado de la LLC que presentan algunos procesadores Intel modernos, esta propuesta ha sido evaluada en una máquina real. Los resultados experimentales muestran que el mecanismo de prebúsqueda selectiva propuesto en esta tesis reduce el número de solicitudes de memoria principal en un 20%, cosa que se traduce en mejoras en la equidad del sistema, el rendimiento y el consumo de energía. Por otro lado, con respecto a los esquemas de partición propuestos, en comparación con un sistema sin particiones, ambas propuestas reducen la iniquidad del sistema en un promedio de más del 25%, independientemente de la cantidad de aplicaciones en ejecución, y esta reducción en la injusticia no afecta negativamente al rendimiento. / Accessing main memory represents a major performance bottleneck in current processors, since the different cores compete among them for the limited offchip bandwidth, aggravating even more the so called memory wall. Several techniques have been applied to deal with the core-memory performance gap, with the most preeminent ones being prefetching and hierarchical caching. Hierarchical caches leverage the temporal and spacial locality of the accessed data, mitigating the huge main memory access latencies. To limit the number of accesses to the off-chip DRAM memory, current processors feature large Last Level Caches. These caches are shared between all the cores to improve the utilization of the cache space and reduce cost. This approach significantly improves the performance of most applications compared to using smaller private caches. Cache sharing, however, presents an important shortcoming: the interference between applications. Prefetching, on the other hand, brings data blocks to the caches before they are requested, hiding the main memory latency. Unfortunately, since prefetching is a speculative technique, inaccurate prefetches may pollute the cache with blocks that will not be used. In addition, the prefetches interfere with the regular memory requests, both the ones from the application running on the core that issued the prefetches and the others. This thesis focuses on reducing the inter-application interference, both in the shared cache and in the access to the main memory. To reduce the interapplication interference in the access to main memory, the proposed approach regulates the aggressiveness of each core prefetcher, and selectively activates or deactivates some of them, depending on their individual performance and the main memory bandwidth requirements of the other cores. With respect to interference in shared caches, this thesis proposes two LLC partitioning techniques that give more cache space to the applications that have their progress diminished due inter-application interferences. The first cache partitioning proposal requires dedicated hardware not available in commercial processors, so it has been evaluated using a simulation framework. The second proposal dealing with cache partitioning presents a family of partitioning policies that overcome the limitations in the number of partitions and the number of available ways by grouping applications and overlapping cache partitions, so multiple applications share the same ways. Since it has been implemented using the cache partitioning features of modern Intel processors it has been evaluated in a real machine. Experimental results show that the proposed selective prefetching mechanism reduces the number of main memory requests by 20%, which translates to improvements in unfairness, performance, and energy consumption. On the other hand, regarding the proposed partitioning schemes, compared to a system with no partitioning, both reduce unfairness more than 25% on average, regardless of the number of applications running in the multicore, and this reduction in unfairness does not negatively affect the performance. / L'accés a la memòria principal en els processadors actuals suposa un important coll d'ampolla per a les prestacions, ja que els diferents nuclis competeixen pel limitat ample de banda de memòria, agreujant la bretxa entre les prestacions del processador i les de la memòria principal. Diferents tècniques ataquen aquest problema, sent les més rellevants l'ús de jerarquies de memòria cau multinivell i la prebusca. Les memòries cau jeràrquiques aprofiten la localitat temporal i espacial que en general presenten els programes en l'accés a les dades per mitigar les enormes latències d'accés a memòria principal. Per limitar el nombre d'accessos a la memòria DRAM, fora del xip, els processadors actuals compten amb grans caus d'últim nivell (LLC). Per millorar la seva utilització i reduir costos, aquestes memòries cau solen compartir-se entre tots els nuclis del processador. Aquest enfocament millora significativament el rendiment de la majoria de les aplicacions en comparació amb l'ús de caus privades més menudes. Compartir la memòria cau, no obstant, presenta una problema important: la interferencia entre aplicacions. La prebusca, per altra banda, porta blocs de dades a les memòries cau abans que el processador els sol·licite, ocultant la latència de memòria principal. Desafortunadament, donat que la prebusca és una técnica especulativa, si no té èxit pot contaminar la memòria cau amb blocs que no fan falta. A més, les prebusques interfereixen amb els accessos normals a memòria, tant els del nucli que emet les prebusques com els dels altres. Aquesta tesi es centra en reduir la interferència entre aplicacions, tant en les cau compartides com en l'accés a la memòria principal. Per reduir la interferència entre aplicacions en l'accés a la memòria principal, el mecanismo proposat en aquesta dissertació regula l'agressivitat de cada prebuscador, activant o desactivant selectivament alguns d'ells, en funció del seu rendiment individual i dels requisits d'ample de banda de memòria principal dels altres nuclis. Pel que fa a la interferència en caus compartides, aquesta tesi proposa dues tècniques de particionat per a la LLC, les quals atorguen més espai de memòria cau a les aplicacions que progressen més lentament a causa de la interferència entre aplicacions. La primera proposta per al particionat de memòria cau requereix hardware específic no disponible en processadors comercials, per la qual cosa s'ha avaluat utilitzant un entorn de simulació. La segona proposta de particionat per a memòries cau presenta una família de polítiques que superen les limitacions en el nombre de particions i en el nombre de vies de memòria cau disponibles mitjan¿ cant l'agrupació d'aplicacions en clústers i la superposició de particions de memòria cau, de manera que diverses aplicacions comparteixen les mateixes vies. Atès que s'ha implementat utilitzant els mecanismes per al particionat de la LLC que ofereixen alguns processadors Intel moderns, aquesta proposta s'ha avaluat en una màquina real. Els resultats experimentals mostren que el mecanisme de prebusca selectiva proposat en aquesta tesi redueix el nombre de sol·licituds a la memòria principal en un 20%, cosa que es tradueix en millores en l'equitat del sistema, el rendiment i el consum d'energia. Per altra banda, pel que fa als esquemes de particiónat proposats, en comparació amb un sistema sense particions, ambdues propostes redueixen la iniquitat del sistema en més d'un 25% de mitjana, independentment de la quantitat d'aplicacions en execució, i aquesta reducció en la iniquitat no afecta negativament el rendiment. / Selfa Oliver, V. (2018). Adaptive Prefetching and Cache Partitioning for Multicore Processors [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/112423 / TESIS
44	Human Mobility and Application Usage Prediction Algorithms for Mobile Devices Baumann, Paul 19 August 2016 (has links) Mobile devices such as smartphones and smart watches are ubiquitous companions of humans’ daily life. Since 2014, there are more mobile devices on Earth than humans. Mobile applications utilize sensors and actuators of these devices to support individuals in their daily life. In particular, 24% of the Android applications leverage users’ mobility data. For instance, this data allows applications to understand which places an individual typically visits. This allows providing her with transportation information, location-based advertisements, or to enable smart home heating systems. These and similar scenarios require the possibility to access the Internet from everywhere and at any time. To realize these scenarios 83% of the applications available in the Android Play Store require the Internet to operate properly and therefore access it from everywhere and at any time. Mobile applications such as Google Now or Apple Siri utilize human mobility data to anticipate where a user will go next or which information she is likely to access en route to her destination. However, predicting human mobility is a challenging task. Existing mobility prediction solutions are typically optimized a priori for a particular application scenario and mobility prediction task. There is no approach that allows for automatically composing a mobility prediction solution depending on the underlying prediction task and other parameters. This approach is required to allow mobile devices to support a plethora of mobile applications running on them, while each of the applications support its users by leveraging mobility predictions in a distinct application scenario. Mobile applications rely strongly on the availability of the Internet to work properly. However, mobile cellular network providers are struggling to provide necessary cellular resources. Mobile applications generate a monthly average mobile traffic volume that ranged between 1 GB in Asia and 3.7 GB in North America in 2015. The Ericsson Mobility Report Q1 2016 predicts that by the end of 2021 this mobile traffic volume will experience a 12-fold increase. The consequences are higher costs for both providers and consumers and a reduced quality of service due to congested mobile cellular networks. Several countermeasures can be applied to cope with these problems. For instance, mobile applications apply caching strategies to prefetch application content by predicting which applications will be used next. However, existing solutions suffer from two major shortcomings. They either (1) do not incorporate traffic volume information into their prefetching decisions and thus generate a substantial amount of cellular traffic or (2) require a modification of mobile application code. In this thesis, we present novel human mobility and application usage prediction algorithms for mobile devices. These two major contributions address the aforementioned problems of (1) selecting a human mobility prediction model and (2) prefetching of mobile application content to reduce cellular traffic. First, we address the selection of human mobility prediction models. We report on an extensive analysis of the influence of temporal, spatial, and phone context data on the performance of mobility prediction algorithms. Building upon our analysis results, we present (1) SELECTOR – a novel algorithm for selecting individual human mobility prediction models and (2) MAJOR – an ensemble learning approach for human mobility prediction. Furthermore, we introduce population mobility models and demonstrate their practical applicability. In particular, we analyze techniques that focus on detection of wrong human mobility predictions. Among these techniques, an ensemble learning algorithm, called LOTUS, is designed and evaluated. Second, we present EBC – a novel algorithm for prefetching mobile application content. EBC’s goal is to reduce cellular traffic consumption to improve application content freshness. With respect to existing solutions, EBC presents novel techniques (1) to incorporate different strategies for prefetching mobile applications depending on the available network type and (2) to incorporate application traffic volume predictions into the prefetching decisions. EBC also achieves a reduction in application launch time to the cost of a negligible increase in energy consumption. Developing human mobility and application usage prediction algorithms requires access to human mobility and application usage data. To this end, we leverage in this thesis three publicly available data set. Furthermore, we address the shortcomings of these data sets, namely, (1) the lack of ground-truth mobility data and (2) the lack of human mobility data at short-term events like conferences. We contribute with JK2013 and UbiComp Data Collection Campaign (UbiDCC) two human mobility data sets that address these shortcomings. We also develop and make publicly available a mobile application called LOCATOR, which was used to collect our data sets. In summary, the contributions of this thesis provide a step further towards supporting mobile applications and their users. With SELECTOR, we contribute an algorithm that allows optimizing the quality of human mobility predictions by appropriately selecting parameters. To reduce the cellular traffic footprint of mobile applications, we contribute with EBC a novel approach for prefetching of mobile application content by leveraging application usage predictions. Furthermore, we provide insights about how and to what extent wrong and uncertain human mobility predictions can be detected. Lastly, with our mobile application LOCATOR and two human mobility data sets, we contribute practical tools for researchers in the human mobility prediction domain. info:eu-repo/classification/ddc/004 ddc:004
45	Human Mobility and Application Usage Prediction Algorithms for Mobile Devices Baumann, Paul 27 October 2016 (has links) (PDF) Mobile devices such as smartphones and smart watches are ubiquitous companions of humans’ daily life. Since 2014, there are more mobile devices on Earth than humans. Mobile applications utilize sensors and actuators of these devices to support individuals in their daily life. In particular, 24% of the Android applications leverage users’ mobility data. For instance, this data allows applications to understand which places an individual typically visits. This allows providing her with transportation information, location-based advertisements, or to enable smart home heating systems. These and similar scenarios require the possibility to access the Internet from everywhere and at any time. To realize these scenarios 83% of the applications available in the Android Play Store require the Internet to operate properly and therefore access it from everywhere and at any time. Mobile applications such as Google Now or Apple Siri utilize human mobility data to anticipate where a user will go next or which information she is likely to access en route to her destination. However, predicting human mobility is a challenging task. Existing mobility prediction solutions are typically optimized a priori for a particular application scenario and mobility prediction task. There is no approach that allows for automatically composing a mobility prediction solution depending on the underlying prediction task and other parameters. This approach is required to allow mobile devices to support a plethora of mobile applications running on them, while each of the applications support its users by leveraging mobility predictions in a distinct application scenario. Mobile applications rely strongly on the availability of the Internet to work properly. However, mobile cellular network providers are struggling to provide necessary cellular resources. Mobile applications generate a monthly average mobile traffic volume that ranged between 1 GB in Asia and 3.7 GB in North America in 2015. The Ericsson Mobility Report Q1 2016 predicts that by the end of 2021 this mobile traffic volume will experience a 12-fold increase. The consequences are higher costs for both providers and consumers and a reduced quality of service due to congested mobile cellular networks. Several countermeasures can be applied to cope with these problems. For instance, mobile applications apply caching strategies to prefetch application content by predicting which applications will be used next. However, existing solutions suffer from two major shortcomings. They either (1) do not incorporate traffic volume information into their prefetching decisions and thus generate a substantial amount of cellular traffic or (2) require a modification of mobile application code. In this thesis, we present novel human mobility and application usage prediction algorithms for mobile devices. These two major contributions address the aforementioned problems of (1) selecting a human mobility prediction model and (2) prefetching of mobile application content to reduce cellular traffic. First, we address the selection of human mobility prediction models. We report on an extensive analysis of the influence of temporal, spatial, and phone context data on the performance of mobility prediction algorithms. Building upon our analysis results, we present (1) SELECTOR – a novel algorithm for selecting individual human mobility prediction models and (2) MAJOR – an ensemble learning approach for human mobility prediction. Furthermore, we introduce population mobility models and demonstrate their practical applicability. In particular, we analyze techniques that focus on detection of wrong human mobility predictions. Among these techniques, an ensemble learning algorithm, called LOTUS, is designed and evaluated. Second, we present EBC – a novel algorithm for prefetching mobile application content. EBC’s goal is to reduce cellular traffic consumption to improve application content freshness. With respect to existing solutions, EBC presents novel techniques (1) to incorporate different strategies for prefetching mobile applications depending on the available network type and (2) to incorporate application traffic volume predictions into the prefetching decisions. EBC also achieves a reduction in application launch time to the cost of a negligible increase in energy consumption. Developing human mobility and application usage prediction algorithms requires access to human mobility and application usage data. To this end, we leverage in this thesis three publicly available data set. Furthermore, we address the shortcomings of these data sets, namely, (1) the lack of ground-truth mobility data and (2) the lack of human mobility data at short-term events like conferences. We contribute with JK2013 and UbiComp Data Collection Campaign (UbiDCC) two human mobility data sets that address these shortcomings. We also develop and make publicly available a mobile application called LOCATOR, which was used to collect our data sets. In summary, the contributions of this thesis provide a step further towards supporting mobile applications and their users. With SELECTOR, we contribute an algorithm that allows optimizing the quality of human mobility predictions by appropriately selecting parameters. To reduce the cellular traffic footprint of mobile applications, we contribute with EBC a novel approach for prefetching of mobile application content by leveraging application usage predictions. Furthermore, we provide insights about how and to what extent wrong and uncertain human mobility predictions can be detected. Lastly, with our mobile application LOCATOR and two human mobility data sets, we contribute practical tools for researchers in the human mobility prediction domain. Mobilitaet von Menschen Vorhersagen Datenanalyse human mobility application usage prefetching data sets analysis ddc:004 rvk:ST 265 rvk:ST 515 rvk:ST 520
46	Adaptive Prefetching for Visual Data Exploration Doshi, Punit Rameshchandra 31 January 2003 (has links) Loading of data from slow persistent memory (disk storage) to main memory represents a bottleneck for current interactive visual data exploration applications, especially when applied to huge volumnes of data. Semantic caching of queries at the client-side is a recently emerging technology that can significantly improve the performance of such systems, though it may not in all cases fully achieve the near real-time responsiveness required by such interactive applications. We hence propose to augment the semantic caching techniques by applying prefetching. That is, the system predicts the user's next requested data and loads the data into the cache as a background process before the next user request is made. Our experimental studies confirm that prefetching indeed achieves performance improvements for interactive visual data exploration. However, a given prefetching technique is not always able to correctly predict changes in a user's navigation pattern. Especially, as different users may have different navigation patterns, implying that the same strategy might fail for a new user. In this research, we tackle this shortcoming by utilizing the adaptation concept of strategy selection to allow the choice of prefetching strategy to change over time both across as well as within one user session. While other adaptive prefetching research has focused on refining a single strategy, we instead have developed a framework that facilitates strategy selection. For this, we explored various metrics to measure performance of prefetching strategies in action and thus guide the adaptive selection process. This work is the first to study caching and prefetching in the context of visual data exploration. In particular, we have implemented and evaluated our proposed approach within XmdvTool, a free-ware visualization system for visually exploring hierarchical multivariate data. We have tested our technique on real user traces gathered by the logging tool of our system as well as on synthetic user traces. Our results confirm that our adaptive approach improves system performance by selecting a good combination of prefetching strategies that adapts to the user's changing navigation patterns. Adaptive prefetching Semantic caching Hierarchical data exploration Exploratory data analysis Cache memory Image processing Digital techniques Multivariate analysis Data processing
47	Design Space Exploration and Optimization of Embedded Memory Systems Rabbah, Rodric Michel 11 July 2006 (has links) Recent years have witnessed the emergence of microprocessors that are embedded within a plethora of devices used in everyday life. Embedded architectures are customized through a meticulous and time consuming design process to satisfy stringent constraints with respect to performance, area, power, and cost. In embedded systems, the cost of the memory hierarchy limits its ability to play as central a role. This is due to stringent constraints that fundamentally limit the physical size and complexity of the memory system. Ultimately, application developers and system engineers are charged with the heavy burden of reducing the memory requirements of an application. This thesis offers the intriguing possibility that compilers can play a significant role in the automatic design space exploration and optimization of embedded memory systems. This insight is founded upon a new analytical model and novel compiler optimizations that are specifically designed to increase the synergy between the processor and the memory system. The analytical models serve to characterize intrinsic program properties, quantify the impact of compiler optimizations on the memory systems, and provide deep insight into the trade-offs that affect memory system design. Cache architecture Temporal locality Embedded systems Design space exploration Compilers Spatial locality Prefetching Data remapping Embedded computer systems Programming Cache memory Compilers (Computer programs) Memory hierarchy
48	Methods for Creating and Exploiting Data Locality Wallin, Dan January 2006 (has links) The gap between processor speed and memory latency has led to the use of caches in the memory systems of modern computers. Programs must use the caches efficiently and exploit data locality for maximum performance. Multiprocessors, built from many processing units, are becoming commonplace not only in large servers but also in smaller systems such as personal computers. Multiprocessors require careful data locality optimizations since accesses from other processors can lead to invalidations and false sharing cache misses. This thesis explores hardware and software approaches for creating and exploiting temporal and spatial locality in multiprocessors. We propose the capacity prefetching technique, which efficiently reduces the number of cache misses but avoids false sharing by distinguishing between cache lines involved in communication from non-communicating cache lines at run-time. Prefetching techniques often lead to increased coherence and data traffic. The new bundling technique avoids one of these drawbacks and reduces the coherence traffic in multiprocessor prefetchers. This is especially important in snoop-based systems where the coherence bandwidth is a scarce resource. Most of the studies have been performed on advanced scientific algorithms. This thesis demonstrates that a cc-NUMA multiprocessor, with hardware data migration and replication optimizations, efficiently exploits the temporal locality in such codes. We further present a method of parallelizing a multigrid Gauss-Seidel partial differential equation solver, which creates temporal locality at the expense of increased communication. Our conclusion is that on modern chip multiprocessors, it is more important to optimize algorithms for data locality than to avoid communication, since communication can take place using a shared cache. data locality temporal locality spatial locality prefetching cache cache behavior cache coherence snooping protocols partial differential equation shared-memory multiprocessor chip multiprocessor simulation Computer engineering Datorteknik
49	Caching and prefetching for efficient video services in mobile networks / Caching et prefetching pour une livraison plus efficace des contenus vidéo dans les réseaux mobiles Gouta, Ali 15 January 2015 (has links) Les réseaux cellulaires ont connu une croissance phénoménale du trafic alimentée par les nouvelles technologies d'accès cellulaire. Cette croissance est en grande partie tirée par l'émergence du trafic HTTP adaptatif streaming (HAS) comme une nouvelle technologie de diffusion des contenus vidéo. Le principe du HAS est de rendre disponible plusieurs qualités de la même vidéo en ligne et que les clients choisissent la meilleure qualité qui correspond à leur bande passante. Chaque niveau d'encodage est segmenté en des chunks, qui dont la durée varie de 2 à 10 secondes. L'émergence du HAS a introduit des nouvelles contraintes sur les systèmes de livraison des contenus vidéo en particulier sur les systèmes de caches. Dans ce contexte, nous menons une analyse détaillée des données du trafic HAS collecté en France et fournie par le plus grand opérateur de téléphonie mobile du pays. Tout d'abord, nous analysons et modélisons le comportement des clients qui demandent des contenus VoD et live. Ces analyses nous ont permis d'identifier les facteurs qui impactent la performance des systèmes de cache et de proposer un nouveau algorithme de remplacement de contenus qu'on appelle WA-LRU. WA-LRU exploite la localité temporelle des chunks dans le contenu et la connaissance de la charge du trafic dans le réseau afin d'améliorer la performance du cache. Ensuite, nous analysons et modélisons la logique d'adaptation entre les qualités vidéo basés sur des observations empiriques. Nous montrons que le changement fréquent entre les encodages réduit considérablement la performance des systèmes de cache. Dans ce contexte, nous présentons CF-DASH une implémentation libre d'un player DASH qui vise à réduire les changements fréquents entre qualités, assure une bonne QoE des clients et améliore la performance des systèmes de caches. La deuxième partie de la thèse est dédié à la conception, simulation et implémentation d'une solution de préchargement des contenus vidéo sur terminaux mobiles. Nous concevons un système que nous appelons «Central Predictor System (CPsys)" qui prédit le comportement des clients mobiles et leurs consommations des vidéos. Nous évaluons CPSys avec des traces de trafic réel. Enfin, nous développons une preuve de concept de notre solution de préchargement. / Recently, cellular networks have witnessed a phenomenal growth of traffic fueled by new high speed broadband cellular access technologies. This growth is in large part driven by the emergence of the HTTP Adaptive Streaming (HAS) as a new video delivery method. In HAS, several qualities of the same videos are made available in the network so that clients can choose the quality that best fits their bandwidth capacity. This strongly impacts the viewing pattern of the clients, their switching behavior between video qualities, and thus beyond on content delivery systems. In this context, we provide an analysis of a real HAS dataset collected in France and provided by the largest French mobile operator. Firstly, we analyze and model the viewing patterns of VoD and live streaming HAS sessions and we propose a new cache replacement strategy, named WA-LRU. WA-LRU leverages the time locality of video segments within the HAS content. We show that WA-LRU improves the performance of the cache. Second, we analyze and model the adaptation logic between the video qualities based on empirical observations. We show that high switching behaviors lead to sub optimal caching performance, since several versions of the same content compete to be cached. In this context we investigate the benefits of a Cache Friendly HAS system (CF-DASH) which aims at improving the caching efficiency in mobile networks and to sustain the quality of experience of mobile clients. Third, we investigate the mobile video prefetching opportunities. We show that CPSys can achieve high performance as regards prediction correctness and network utilization efficiency. We further show that CPSys outperforms other prefetching schemes from the state of the art. At the end, we provide a proof-of-concept implementation of our prefetching system. Réseaux mobiles Cache transparent Cdn Prédiction Préchargement Réseaux sociaux Streaming adaptatif Mobile networks Adaptive streaming Caching Cdn Prediction Prefetching Social networks
50	Adéquation Algorithme Architecture pour la reconstruction 3D en imagerie médicale TEP Gac, Nicolas 17 July 2008 (has links) (PDF) L'amélioration constante de la résolution dynamique et temporelle des scanners et des méthodes de reconstruction en imagerie médicale, s'accompagne d'un besoin croissant en puissance de calcul. Les accélérations logicielles, algorithmiques et matérielles sont ainsi appelées à réduire le fossé technologique existant entre les systèmes d'acquisition et ceux de reconstruction.<br />Dans ce contexte, une architecture matérielle de rétroprojection 3D en Tomographie à Emission de Positons (TEP) est proposée. Afin de lever le verrou technologique constitué par la forte latence des mémoires externes de type SDRAM, la meilleure Adéquation Algorithme Architecture a été recherchée. Cette architecture a été implémentée sur un SoPC (System on Programmable Chip) et ses performances comparées à celles d'un PC, d'un serveur de calcul et d'une carte graphique. Associée à un module matériel de projection 3D, cette architecture permet de définir une paire matérielle de projection/rétroprojection et de constituer ainsi un système de reconstruction complet. imagerie médicale Tomographie à Emission de Positons TEP problème inverse reconstruction itérative rétroprojection Adéquation Algorithme Architecture accélération matérielle parallélisme FPGA SoPC GPU prefetching cache mémoire

Search results