  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
141

REST API vs GraphQL : A literature and experimental study

Andersson, Tobias, Reinholdsson, Håkan January 2021
The purpose of this study is to compare two architectural techniques, REST and GraphQL, and what defines them. The researchers carry out a literature study and an experimental study. Four applications were developed, each able to enable or disable caching, to test the effect of caching on performance for both technologies. Earlier work has not covered the effects of caching for these two techniques. The literature study indicates that REST services remain current, while GraphQL is a technique with a shorter history that has shown growth in the industry and is a well-suited choice when, for example, bandwidth matters in mobile phone applications. In the experimental study, the tests showed slightly better results on average for the REST API in terms of total response time (ms). Depending on the intended project, many factors need to be evaluated before deciding which technique to use.
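The experimental setup described above (the same service measured with caching enabled and disabled) can be sketched as a small timing harness. This is only an illustration under assumptions: the `Endpoint` class, the `slow_backend` function, and the 1 ms simulated delay are hypothetical and are not taken from the thesis, which built four real applications.

```python
import time
from statistics import mean


class Endpoint:
    """Stand-in for one API endpoint with an optional response cache."""

    def __init__(self, backend, use_cache):
        self.backend = backend      # function that computes a response
        self.use_cache = use_cache  # toggle, as in the cached/uncached test runs
        self.cache = {}
        self.backend_calls = 0

    def handle(self, request):
        # Serve from the cache when caching is on and the request was seen before.
        if self.use_cache and request in self.cache:
            return self.cache[request]
        self.backend_calls += 1
        response = self.backend(request)
        if self.use_cache:
            self.cache[request] = response
        return response


def mean_response_ms(endpoint, requests):
    """Average wall-clock time per request, in milliseconds."""
    timings = []
    for request in requests:
        start = time.perf_counter()
        endpoint.handle(request)
        timings.append((time.perf_counter() - start) * 1000.0)
    return mean(timings)


def slow_backend(request):
    time.sleep(0.001)  # hypothetical 1 ms of simulated work
    return f"payload for {request}"


workload = ["/users/1", "/users/2", "/users/1", "/users/1"]
cached = Endpoint(slow_backend, use_cache=True)
uncached = Endpoint(slow_backend, use_cache=False)
cached_ms = mean_response_ms(cached, workload)
uncached_ms = mean_response_ms(uncached, workload)
```

Counting backend calls alongside the timings gives a deterministic check that the cache toggle behaves as intended, independent of timer noise.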
142

Factorisation du rendu de Monte-Carlo fondée sur les échantillons et le débruitage bayésien / Factorization of Monte Carlo rendering based on samples and Bayesian denoising

Boughida, Malik 23 March 2017
Monte Carlo ray tracing is known to be a particularly well-suited class of algorithms for photorealistic rendering. However, its fundamentally random nature produces characteristic noise in the generated images. In this thesis, we develop new algorithms based on Monte Carlo samples and Bayesian inference in order to factorize rendering computations, by sharing information across neighboring pixels or by caching previously computed results. In the context of offline rendering, we build upon a recent denoising technique from the image processing community, called Non-local Bayes, to develop a new patch-based collaborative denoising algorithm, named Bayesian Collaborative Denoising. It is designed to be adapted to the specificities of Monte Carlo noise, and uses the additional input data that we can get by gathering per-pixel sample statistics. In a second step, to factorize the computations of interactive Monte Carlo rendering of dynamic scenes, we propose a complete rendering algorithm based on path tracing, called Dynamic Bayesian Caching. A partition of the pixels enables a smart grouping of samples, which are then numerous enough to compute meaningful statistics. These statistics are compared with the ones stored in a cache to decide whether the former should replace or be merged with the latter. Finally, a Bayesian denoising step, inspired by the work of the first part, is applied to enhance image quality.
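The replace-or-merge step on cached statistics can be illustrated with a short sketch. The merge below is the standard pairwise combination of count, mean, and sum of squared deviations; the decision rule in `update_cache` (replace when the means differ by more than a fixed number of standard errors) is a hypothetical stand-in, since the abstract does not state the actual Bayesian criterion.

```python
import math
from dataclasses import dataclass


@dataclass
class Stats:
    n: int       # number of samples
    mean: float  # sample mean
    m2: float    # sum of squared deviations from the mean

    @property
    def variance(self):
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


def from_samples(samples):
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((s - mean) ** 2 for s in samples)
    return Stats(n, mean, m2)


def merge(a, b):
    """Combine two sets of statistics without revisiting the samples."""
    n = a.n + b.n
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta * delta * a.n * b.n / n
    return Stats(n, mean, m2)


def update_cache(cached, fresh, threshold=3.0):
    """Hypothetical rule: replace if the means disagree strongly, else merge."""
    std_err = math.sqrt(cached.variance / cached.n + fresh.variance / fresh.n)
    if abs(fresh.mean - cached.mean) > threshold * std_err:
        return fresh              # the scene likely changed: drop stale data
    return merge(cached, fresh)   # consistent data: enrich the cache
```

Merging keeps the cache entry's sample count growing while the scene is stable, which is what lets the statistics become meaningful at interactive sample rates.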
143

Optimisation distribuée dans les grands systèmes interconnectés avec ADMM / Distributed optimization in large interconnected systems using ADMM

Abboud, Azary 12 January 2016
This thesis focuses on the construction of distributed algorithms for optimizing resource production and sharing in a large interconnected system. In particular, it focuses on power grids and 5G cellular networks. In the case of power grids, we consider the OPF (Optimal Power Flow) problem, in which one seeks to manage and optimize the production of electrical energy in a distributed manner. We focus on a linearized version of the problem, the DC-OPF (Direct-Current Optimal Power Flow) problem. This optimization problem is convex; the aim is to minimize the cost of energy generation while respecting the limits of the transmission lines and the power flow constraints. In the case of 5G cellular networks, we formulate a caching problem. We aim to reduce usage of the backhaul links connecting the small base stations (SBSs) to the central scheduler (CS). The SBSs are equipped with a limited storage capacity. We seek the optimal set of files to store so as to reduce a cost function on backhaul usage and on file sharing with other SBSs. The approach adopted in this thesis is to apply the ADMM (Alternating Direction Method of Multipliers), an iterative optimization method, to an optimization problem that we first reformulate appropriately. This problem can describe both the DC-OPF problem and the caching problem. We prove the convergence of the method when it is applied node by node in a fully distributed manner. Additionally, we prove its convergence in the case where the network is divided into multiple areas or nations that may or may not overlap. Furthermore, in the context of a network with multiple areas, we show that applying ADMM in a random manner, by a single randomly chosen area at a time, also converges to the optimal solution of the problem.
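To make the node-by-node use of ADMM concrete, here is a minimal sketch of consensus ADMM in scaled form on a toy problem. The objective (each node holds a private target and all nodes must agree on one value minimizing the summed squared distance) and the names `targets` and `rho` are assumptions chosen for illustration; they are far simpler than the DC-OPF and caching problems treated in the thesis.

```python
def consensus_admm(targets, rho=1.0, iterations=100):
    """Minimize sum_i 0.5 * (x - targets[i])**2 by consensus ADMM (scaled form)."""
    n = len(targets)
    x = [0.0] * n  # local copies of the variable, one per node
    u = [0.0] * n  # scaled dual variables, one per node
    z = 0.0        # shared consensus variable
    for _ in range(iterations):
        # Local step: each node solves its own small problem in closed form.
        x = [(a + rho * (z - ui)) / (1.0 + rho) for a, ui in zip(targets, u)]
        # Consensus step: average the local estimates plus their duals.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: accumulate each node's disagreement with the consensus.
        u = [ui + xi - z for xi, ui in zip(x, u)]
    return z
```

For this toy objective the optimum is the mean of the targets, so `consensus_admm([1.0, 2.0, 6.0])` approaches 3.0. Each node only needs its own target and the shared value `z`, which is the property that makes the method attractive for distributed settings.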
144

L’amélioration des performances des systèmes sans fil 5G par groupements adaptatifs des utilisateurs / Performance improvement of 5G Wireless Systems through adaptive grouping of users

Hajri, Salah Eddine 09 April 2018
5G is envisioned to tackle, in addition to a considerable increase in traffic volume, the task of connecting billions of devices with heterogeneous service requirements. In order to address the challenges of 5G, we advocate a more efficient use of the available information, with more service and user awareness, and an expansion of the RAN intelligence. In particular, we focus on two key enablers of 5G, namely massive MIMO and proactive caching. In the third chapter, we focus on addressing the bottleneck of CSI acquisition in TDD massive MIMO. To do so, we propose novel spatial grouping schemes such that, in each group, maximum coverage of the signal's spatial basis is achieved with minimum overlap between user spatial signatures. This makes it possible to increase connection density while improving spectral efficiency. TDD massive MIMO is also the focus of the fourth chapter. There, based on the different aging rates of wireless channels, CSI estimation periodicity is exploited as an additional degree of freedom (DoF). We do so by proposing a dynamic adaptation of the TDD frame based on the heterogeneous channel coherence times. The massive MIMO base stations are enabled to learn the best uplink training policy over long periods. Since channel changes result primarily from device mobility, location awareness is also included in the learning process. The resulting planning problem was modeled as a two-time-scale POMDP, and efficient low-complexity algorithms were provided to solve it. The fifth chapter focuses on proactive caching. We focus on improving the energy efficiency of cache-enabled networks by exploiting the correlation in traffic patterns in addition to the spatial distribution of requests. We propose a framework that strikes the optimal trade-off between complexity and truthfulness in user behavior modeling through adaptive content popularity-based clustering. It also simplifies the problem of content placement, which results in a rapidly adaptable and energy-efficient content allocation framework.
145

Optimizing Consensus Protocols with Machine Learning Models : A cache-based approach

Wu, Kun January 2023
Distributed systems offer a reliable and scalable solution for tackling massive and complex tasks that cannot be handled by a single computer. However, standard consensus protocols used in such systems often replicate data without considering the workload, leading to unnecessary retransmissions. This thesis proposes using machine learning (ML) to optimize consensus protocols and make them adaptable to recurring workloads. It introduces a cache that encodes frequently transmitted data between nodes to reduce network traffic. To implement this, the thesis builds a caching layer at all nodes using the decided log, which represents a consistent view of the application history. The cache can encode and decode incoming log entries to reduce the average message size and improve throughput under limited network bandwidth. The thesis selects an ML-based model that combines various caching policies and adapts to changing access patterns in the workload. Experimental results show that this approach can improve throughput by up to 250%, assuming negligible preprocessing overhead.
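The idea of encoding repeated log entries against a cache that every node can rebuild from the decided log can be sketched as follows. This is a simplified illustration, not the thesis's design: it uses a plain LRU policy instead of the ML-based policy mix, and the `LogCache` class and the `("ref", id)` / `("raw", payload)` message shapes are hypothetical.

```python
from collections import OrderedDict


class LogCache:
    """LRU cache of payloads, updated in decided-log order so all nodes agree."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.by_payload = OrderedDict()  # payload -> short id, oldest first
        self.by_id = {}                  # short id -> payload
        self.next_id = 0

    def _touch(self, payload):
        # Record one use of `payload`; both sides call this for every entry.
        if payload in self.by_payload:
            self.by_payload.move_to_end(payload)
            return
        if len(self.by_payload) >= self.capacity:
            _, evicted_id = self.by_payload.popitem(last=False)
            del self.by_id[evicted_id]
        self.by_payload[payload] = self.next_id
        self.by_id[self.next_id] = payload
        self.next_id += 1

    def encode(self, payload):
        """Return ('ref', id) on a cache hit, ('raw', payload) on a miss."""
        if payload in self.by_payload:
            message = ("ref", self.by_payload[payload])
        else:
            message = ("raw", payload)
        self._touch(payload)
        return message

    def decode(self, message):
        kind, value = message
        payload = self.by_id[value] if kind == "ref" else value
        self._touch(payload)
        return payload
```

Because sender and receiver apply the same updates in the same order, a short reference always resolves to the same payload on both sides, and no extra synchronization messages are needed.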
146

Improved Internet Security Protocols Using Cryptographic One-Way Hash Chains

Alabrah, Amerah 01 January 2014
In this dissertation, new approaches that utilize one-way cryptographic hash functions in designing improved network security protocols are investigated. The proposed approaches are designed to be scalable and easy to implement in modern technology. The first contribution explores session cookies, with emphasis on the threat of session hijacking attacks resulting from session cookie theft or sniffing. In the proposed scheme, these cookies are replaced by easily computed authentication credentials using Lamport's well-known one-time passwords. The basic idea in this scheme revolves around utilizing sparse caching units, where authentication credentials pertaining to cookies are stored and fetched once needed, thereby mitigating the computational overhead generally associated with one-way hash constructions. The second and third proposed schemes rely on dividing the one-way hash construction into a hierarchical two-tier construction. Each tier component is responsible for some aspect of authentication generated by using two different hash functions. By utilizing different cryptographic hash functions arranged in two tiers, the hierarchical two-tier protocol (our second contribution) gives significant performance improvement over previously proposed solutions for securing Internet cookies. By indexing authentication credentials by their position within the hash chain in a multi-dimensional chain, the third contribution achieves improved performance. In the fourth proposed scheme, an attempt is made to apply the one-way hash construction to achieve user and broadcast authentication in wireless sensor networks. Due to known energy and memory constraints, the one-way hash scheme is modified to mitigate computational overhead so it can be easily applied in this particular setting. The fifth scheme tries to reap the benefits of the sparse cache-supported scheme and the hierarchical scheme.
The resulting hybrid approach achieves efficient performance at the lowest cost of caching possible. In the sixth proposal, an authentication scheme tailored for the multi-server single sign-on (SSO) environment is presented. The scheme utilizes the one-way hash construction in a Merkle Hash Tree and a hash calendar to avoid impersonation and session hijacking attacks. The scheme also explores the optimal configuration of the one-way hash chain in this particular environment. All the proposed protocols are validated by extensive experimental analyses. These analyses are obtained by running simulations depicting the many scenarios envisioned. Additionally, these simulations are supported by relevant analytical models derived by mathematical formulas taking into consideration the environment under investigation.
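A Lamport-style hash chain with sparsely cached checkpoints, the building block named in the first contribution, can be sketched as below. The choice of SHA-256, the class names, and the fixed checkpoint `spacing` are assumptions for illustration; the dissertation's actual caching-unit layout is not given in the abstract.

```python
import hashlib


def h(value):
    """One application of the one-way function (SHA-256 is an assumption)."""
    return hashlib.sha256(value).digest()


class SparseHashChain:
    """Client side: stores only every `spacing`-th chain value."""

    def __init__(self, seed, length, spacing):
        self.length = length
        self.spacing = spacing
        self.checkpoints = {0: seed}  # position -> h applied `position` times
        value = seed
        for position in range(1, length + 1):
            value = h(value)
            if position % spacing == 0:
                self.checkpoints[position] = value
        self.anchor = value  # end of the chain, handed to the verifier once

    def value_at(self, position):
        # Start from the nearest stored checkpoint at or below `position`,
        # so at most `spacing - 1` hash evaluations are needed.
        base = (position // self.spacing) * self.spacing
        value = self.checkpoints[base]
        for _ in range(position - base):
            value = h(value)
        return value


class Verifier:
    """Server side: remembers only the last accepted credential."""

    def __init__(self, anchor):
        self.current = anchor

    def accept(self, candidate):
        if h(candidate) == self.current:
            self.current = candidate  # each credential is usable only once
            return True
        return False
```

With a chain of length 100 and spacing 10, the client keeps 11 values instead of 101 and pays at most 9 hash evaluations per credential, which is the storage-versus-computation trade-off that sparse caching targets.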
147

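Estudio, análisis y desarrollo de una red de distribución de contenido y su algoritmo de redirección de usuarios para servicios web y streaming / Study, analysis and development of a content delivery network and its user redirection algorithm for web and streaming services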

Molina Moreno, Benjamin 02 September 2013
This thesis was written within the research line on Content Distribution Mechanisms in IP Networks, which has carried out its work in various research projects and in the course "Mecanismos de Distribución de Contenidos en Redes IP" of the "Telecomunicaciones" doctoral program taught by the Department of Communications of the UPV and, currently, in the University Master's Degree in Communication Technologies, Systems and Networks. The growth of the Internet is widely known, both in number of clients and in traffic generated. This makes it possible to bring clients a multimedia interface in which data, voice, video, music, and so on can converge. Although this represents a business opportunity along multiple dimensions, scalability must be addressed seriously: the average performance of a system should not be affected as the number of clients or the volume of requested information grows. The study and analysis of web and streaming content distribution using CDNs is the subject of this project. The approach is taken from a general perspective, setting aside network-layer solutions such as IP multicast, as well as resource reservation, because they are not natively available in the Internet infrastructure. This leads to introducing the application layer as the coordinating framework for content distribution. Among these networks, also called overlay networks, a Content Delivery Network (CDN) was chosen. Application-level networks of this type are highly scalable and allow full control over the resources and functionality of every element of their architecture. This makes it possible to evaluate the performance of a CDN that distributes multimedia content in terms of required bandwidth, response time obtained by clients, perceived quality, distribution mechanisms, time to live when caching is used, and so on. CDNs emerged in the late 1990s with the main goal of eliminating or attenuating the so-called flash-crowd effect, caused by a massive influx of clients. Today, networks of this type direct most of their effort toward the ability to offer streaming media over the Internet. For a thorough analysis, this thesis proposes a simplified initial CDN model, at both the theoretical and the practical level. On the theoretical side, a mathematical model is presented that allows a CDN to be evaluated analytically. This model becomes considerably more complex as new functionality is introduced, so a simulation model is proposed and developed that makes it possible, on the one hand, to check the validity of the mathematical framework and, on the other, to establish a comparative framework for the practical implementation of the CDN, a task carried out in the final phase of the thesis. The results obtained thus span theory, simulation, and practice. / Molina Moreno, B. (2013). Estudio, análisis y desarrollo de una red de distribución de contenido y su algoritmo de redirección de usuarios para servicios web y streaming [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/31637
148

Adaptive Prefetching and Cache Partitioning for Multicore Processors

Selfa Oliver, Vicent 13 November 2018 (has links)
El acceso a la memoria principal en los procesadores actuales supone un importante cuello de botella para las prestaciones, dado que los diferentes núcleos compiten por el limitado ancho de banda de memoria, agravando la brecha entre las prestaciones del procesador y las de la memoria principal. Distintas técnicas atacan este problema, siendo las más relevantes el uso de jerarquías de caché multinivel y la prebúsqueda. Las cachés jerárquicas aprovechan la localidad temporal y espacial que en general presentan los programas en el acceso a los datos, para mitigar las enormes latencias de acceso a memoria principal. Para limitar el número de accesos a la memoria DRAM, fuera del chip, los procesadores actuales cuentan con grandes cachés de último nivel (LLC). Para mejorar su utilización y reducir costes, estas cachés suelen compartirse entre todos los núcleos del procesador. Este enfoque mejora significativamente el rendimiento de la mayoría de las aplicaciones en comparación con el uso de cachés privados más pequeños. Compartir la caché, sin embargo, presenta una problema importante: la interferencia entre aplicaciones. La prebúsqueda, por otro lado, trae bloques de datos a las cachés antes de que el procesador los solicite, ocultando la latencia de memoria principal. Desafortunadamente, dado que la prebúsqueda es una técnica especulativa, si no tiene éxito puede contaminar la caché con bloques que no se usarán. Además, las prebúsquedas interfieren con los accesos a memoria normales, tanto los del núcleo que emite las prebúsquedas como los de los demás. Esta tesis se centra en reducir la interferencia entre aplicaciones, tanto en las caché compartidas como en el acceso a la memoria principal. 
Para reducir la interferencia entre aplicaciones en el acceso a la memoria principal, el mecanismo propuesto en esta disertación regula la agresividad de cada prebuscador, activando o desactivando selectivamente algunos de ellos, dependiendo de su rendimiento individual y de los requisitos de ancho de banda de memoria principal de los otros núcleos. Con respecto a la interferencia en cachés compartidos, esta tesis propone dos técnicas de particionado para la LLC, las cuales otorgan más espacio de caché a las aplicaciones que progresan más lentamente debido a la interferencia entre aplicaciones. La primera propuesta de particionado de caché requiere hardware específico no disponible en procesadores comerciales, por lo que se ha evaluado utilizando un entorno de simulación. La segunda propuesta de particionado de caché presenta una familia de políticas que superan las limitaciones en el número de particiones y en el número de vías de caché disponibles mediante la agrupación de aplicaciones en clústeres y la superposición de particiones de caché, por lo que varias aplicaciones comparten las mismas vías. Dado que se ha implementado utilizando los mecanismos para el particionado de la LLC que presentan algunos procesadores Intel modernos, esta propuesta ha sido evaluada en una máquina real. Los resultados experimentales muestran que el mecanismo de prebúsqueda selectiva propuesto en esta tesis reduce el número de solicitudes de memoria principal en un 20%, cosa que se traduce en mejoras en la equidad del sistema, el rendimiento y el consumo de energía. Por otro lado, con respecto a los esquemas de partición propuestos, en comparación con un sistema sin particiones, ambas propuestas reducen la iniquidad del sistema en un promedio de más del 25%, independientemente de la cantidad de aplicaciones en ejecución, y esta reducción en la injusticia no afecta negativamente al rendimiento. 
/ Accessing main memory represents a major performance bottleneck in current processors, since the different cores compete among them for the limited offchip bandwidth, aggravating even more the so called memory wall. Several techniques have been applied to deal with the core-memory performance gap, with the most preeminent ones being prefetching and hierarchical caching. Hierarchical caches leverage the temporal and spacial locality of the accessed data, mitigating the huge main memory access latencies. To limit the number of accesses to the off-chip DRAM memory, current processors feature large Last Level Caches. These caches are shared between all the cores to improve the utilization of the cache space and reduce cost. This approach significantly improves the performance of most applications compared to using smaller private caches. Cache sharing, however, presents an important shortcoming: the interference between applications. Prefetching, on the other hand, brings data blocks to the caches before they are requested, hiding the main memory latency. Unfortunately, since prefetching is a speculative technique, inaccurate prefetches may pollute the cache with blocks that will not be used. In addition, the prefetches interfere with the regular memory requests, both the ones from the application running on the core that issued the prefetches and the others. This thesis focuses on reducing the inter-application interference, both in the shared cache and in the access to the main memory. To reduce the interapplication interference in the access to main memory, the proposed approach regulates the aggressiveness of each core prefetcher, and selectively activates or deactivates some of them, depending on their individual performance and the main memory bandwidth requirements of the other cores. 
With respect to interference in shared caches, this thesis proposes two LLC partitioning techniques that give more cache space to the applications that have their progress diminished due inter-application interferences. The first cache partitioning proposal requires dedicated hardware not available in commercial processors, so it has been evaluated using a simulation framework. The second proposal dealing with cache partitioning presents a family of partitioning policies that overcome the limitations in the number of partitions and the number of available ways by grouping applications and overlapping cache partitions, so multiple applications share the same ways. Since it has been implemented using the cache partitioning features of modern Intel processors it has been evaluated in a real machine. Experimental results show that the proposed selective prefetching mechanism reduces the number of main memory requests by 20%, which translates to improvements in unfairness, performance, and energy consumption. On the other hand, regarding the proposed partitioning schemes, compared to a system with no partitioning, both reduce unfairness more than 25% on average, regardless of the number of applications running in the multicore, and this reduction in unfairness does not negatively affect the performance. / L'accés a la memòria principal en els processadors actuals suposa un important coll d'ampolla per a les prestacions, ja que els diferents nuclis competeixen pel limitat ample de banda de memòria, agreujant la bretxa entre les prestacions del processador i les de la memòria principal. Diferents tècniques ataquen aquest problema, sent les més rellevants l'ús de jerarquies de memòria cau multinivell i la prebusca. Les memòries cau jeràrquiques aprofiten la localitat temporal i espacial que en general presenten els programes en l'accés a les dades per mitigar les enormes latències d'accés a memòria principal. 
Per limitar el nombre d'accessos a la memòria DRAM, fora del xip, els processadors actuals compten amb grans caus d'últim nivell (LLC). Per millorar la seva utilització i reduir costos, aquestes memòries cau solen compartir-se entre tots els nuclis del processador. Aquest enfocament millora significativament el rendiment de la majoria de les aplicacions en comparació amb l'ús de caus privades més menudes. Compartir la memòria cau, no obstant, presenta una problema important: la interferencia entre aplicacions. La prebusca, per altra banda, porta blocs de dades a les memòries cau abans que el processador els sol·licite, ocultant la latència de memòria principal. Desafortunadament, donat que la prebusca és una técnica especulativa, si no té èxit pot contaminar la memòria cau amb blocs que no fan falta. A més, les prebusques interfereixen amb els accessos normals a memòria, tant els del nucli que emet les prebusques com els dels altres. Aquesta tesi es centra en reduir la interferència entre aplicacions, tant en les cau compartides com en l'accés a la memòria principal. Per reduir la interferència entre aplicacions en l'accés a la memòria principal, el mecanismo proposat en aquesta dissertació regula l'agressivitat de cada prebuscador, activant o desactivant selectivament alguns d'ells, en funció del seu rendiment individual i dels requisits d'ample de banda de memòria principal dels altres nuclis. Pel que fa a la interferència en caus compartides, aquesta tesi proposa dues tècniques de particionat per a la LLC, les quals atorguen més espai de memòria cau a les aplicacions que progressen més lentament a causa de la interferència entre aplicacions. La primera proposta per al particionat de memòria cau requereix hardware específic no disponible en processadors comercials, per la qual cosa s'ha avaluat utilitzant un entorn de simulació. 
The second cache partitioning proposal presents a family of policies that overcome the limits on the number of partitions and on the number of available cache ways by grouping applications into clusters and overlapping cache partitions, so that several applications share the same ways. Since it has been implemented using the LLC partitioning mechanisms offered by some modern Intel processors, this proposal has been evaluated on a real machine. The experimental results show that the selective prefetching mechanism proposed in this thesis reduces the number of main memory requests by 20%, which translates into improvements in system fairness, performance, and energy consumption. As for the proposed partitioning schemes, compared with a system without partitions, both proposals reduce system unfairness by more than 25% on average, regardless of the number of running applications, and this reduction in unfairness does not negatively affect performance. / Selfa Oliver, V. (2018). Adaptive Prefetching and Cache Partitioning for Multicore Processors [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/112423
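The idea of overlapping partitions that share ways can be pictured with a small sketch. It assumes that a partition is expressed as a contiguous bitmask over cache ways, which is how Intel's Cache Allocation Technology describes partitions. The function names and the even-spacing placement rule are hypothetical illustrations and are not the policies proposed in the thesis.

```python
def way_mask(first_way: int, n_ways: int) -> int:
    """Return a bitmask with n_ways contiguous bits set, starting at bit first_way."""
    if first_way < 0 or n_ways < 1:
        raise ValueError("first_way must be >= 0 and n_ways >= 1")
    return ((1 << n_ways) - 1) << first_way


def overlapping_masks(total_ways: int, ways_per_cluster: list[int]) -> list[int]:
    """Place one contiguous partition per cluster across total_ways cache ways.

    Hypothetical placement rule: start positions are spaced evenly, so
    partitions may overlap when the requested ways exceed total_ways.
    """
    if any(w < 1 or w > total_ways for w in ways_per_cluster):
        raise ValueError("each cluster needs between 1 and total_ways ways")
    k = len(ways_per_cluster)
    masks = []
    for i, width in enumerate(ways_per_cluster):
        # The first cluster starts at way 0; the last one ends at the top way.
        start = 0 if k == 1 else i * (total_ways - width) // (k - 1)
        masks.append(way_mask(start, width))
    return masks
```

For example, `overlapping_masks(8, [4, 4, 4])` yields the masks `0x0F`, `0x3C` and `0xF0`: three clusters fit into eight ways because neighbouring clusters share two ways each.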
149

Optimizations In Storage Area Networks And Direct Attached Storage

Dharmadeep, M C 02 1900
The thesis consists of three parts. In the first part, we introduce the notion of device-cache-aware schedulers. Modern disk subsystems have many megabytes of memory for various purposes such as prefetching and caching. Current disk scheduling algorithms make decisions oblivious of the underlying device cache algorithms. In this thesis, we propose a scheduler architecture that is aware of the underlying device cache. We also describe how the underlying device cache parameters can be automatically deduced and incorporated into the scheduling algorithm. We have considered only adaptive caching algorithms, as modern high-end disk subsystems are configured by default to use such algorithms. We implemented a prototype for the Linux anticipatory scheduler and observed, compared with the anticipatory scheduler, up to 3 times improvement in query execution times with the Benchw benchmark and up to 10 percent improvement with the Postmark benchmark. The second part deals with implementing cooperative caching for the Red Hat Global File System. The Red Hat Global File System (GFS) is a clustered shared-disk file system. Multiple accesses are coordinated through a lock manager. On a read, a lock on the inode is acquired in shared mode and the data is read from the disk. For a write, an exclusive lock on the inode is acquired and data is written to the disk; this requires all nodes holding the lock to write their dirty buffers/pages to disk and invalidate all the related buffers/pages. A DLM (Distributed Lock Manager) is a module that implements the functions of a lock manager. GFS’s DLM has some support for range locks, although GFS does not use it. While it is clear that data sourced from a memory copy is likely to have lower latency, GFS currently reads from the shared disk after acquiring a lock (just as in other designs such as IBM’s GPFS) rather than from remote memory that recently held the correct contents. 
The difficulties are mainly due to the circular relationships that can arise between GFS and the generic DLM architecture when integrating the DLM locking framework with cooperative caching. For example, the page/buffer cache should be accessible from DLM, and yet DLM’s generality has to be preserved. The symmetric nature of DLM (including the SMP concurrency model) makes it even more difficult to understand and to integrate cooperative caching into it (note that GPFS has an asymmetrical design). In this thesis, we describe the design of a cooperative caching scheme in GFS. To make it more effective, we have also introduced changes to the locking protocol and DLM to handle range locks more efficiently. Experiments with micro-benchmarks on our prototype implementation reveal that reading from a remote node over gigabit Ethernet can be up to 8 times faster than reading from an enterprise-class SCSI disk for random disk reads. Our contributions are an integrated design for cooperative caching and the lock manager for GFS, a novel method for interval searches, and a determination of when sequential reads from remote memory perform better than sequential reads from a disk. The third part deals with selecting a primary network partition in a clustered shared-disk system when node or network failures occur. Clustered shared-disk file systems like GFS and GPFS use methods that can fail in the case of multiple network partitions and also in the case of a two-node cluster. In this thesis, we give an algorithm for fault-tolerant proactive leader election in asynchronous shared memory systems, and later its formal verification. Roughly speaking, a leader election algorithm is proactive if it can tolerate failure of nodes even after a leader is elected, and (stable) leader election happens periodically. 
This is needed in systems where a leader is required after every failure to ensure the availability of the system and where there might be no explicit events such as messages in the (shared memory) system. Previous algorithms like DiskPaxos are not proactive. In our model, individual nodes can fail and reincarnate at any point in time. Each node has a counter that is incremented every period, and the period is the same across all the nodes (modulo a maximum drift). Different nodes can be in different epochs at the same time. Our algorithm ensures that there can be at most one leader per epoch. So if the counter values of some set of nodes match, then there can be at most one leader among them. If the nodes satisfy certain timeliness constraints, then the leader for the epoch with the highest counter also becomes the leader for the next epoch (stable property). Our algorithm uses shared memory proportional to the number of processes, the best possible. We also show how our protocol can be used in clustered shared-disk systems to select a primary network partition. We have used the state machine approach to represent our protocol in the Isabelle/HOL logic system and have proved the safety property of the protocol.
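The safety property stated above, at most one leader per epoch, can be illustrated with a toy model. This is a single-process sketch and not the thesis's algorithm: the class name `EpochLeaderTable` and its methods are hypothetical, and `dict.setdefault` stands in for an atomic first-writer-wins operation, which is a stronger primitive than the plain shared memory the abstract describes.

```python
from typing import Dict, Optional


class EpochLeaderTable:
    """Toy record of leadership claims, keyed by epoch (counter value)."""

    def __init__(self) -> None:
        self._leader: Dict[int, str] = {}

    def try_claim(self, node: str, epoch: int) -> bool:
        """Try to become leader of `epoch`; only the first claimant succeeds.

        Assumption: setdefault models an atomic first-writer-wins step.
        """
        return self._leader.setdefault(epoch, node) == node

    def leader(self, epoch: int) -> Optional[str]:
        """Return the leader recorded for `epoch`, or None if nobody claimed it."""
        return self._leader.get(epoch)
```

Two nodes whose counters match compete for the same key, so at most one of them obtains leadership. Nodes in different epochs claim different keys, which is why different epochs may have different leaders at the same time.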
150

Hantering av nätverkscache i DNS

Lindqvist, Hans January 2019
The Domain Name System, DNS, is a fundamental part of the usability of the Internet, but its caching function is challenged by the increase in address size, number of addresses and automation. Meanwhile, certain devices at the Internet’s edge, towards the Internet of Things, have limited memory capacity. This study has taken a closer look at current needs for DNS resolution and considered how DNS is affected by IPv6 address propagation, mobile devices, content delivery networks and web browser functions. In two freely available DNS resolver implementations, the investigation searched for the optimal cache memory management for constrained devices in, or at the border of, the Internet of Things. Using the open source code of the programs, Unbound and PowerDNS Recursor, the structures of each were interpreted in order to approximate and compare memory requirements. Afterwards, a laboratory simulation was run using fictitious DNS data with real-world characteristics to measure the actual memory consumption of the server process. The simulation avoided individual adaptation of program settings, involvement of DNSSEC data and imposing memory constraints on the test environment. The source code analysis estimated that Unbound handled A+AAAA records more optimally, while PowerDNS Recursor was more efficient for PTR records. For both record types taken as a whole, the measurements in the simulation showed that Unbound was able to store DNS data more densely than PowerDNS Recursor. The result has shown that the standardized wire format for DNS data used in Unbound is less optimal than the object-based format of PowerDNS Recursor. On the other hand, the study showed that Unbound, which is written procedurally in the C language, was able to manage the cache more sparingly than the object-oriented PowerDNS Recursor, which was developed in C++. 
/ The Domain Name System, DNS, is a fundamental part of the usability of the Internet, but its caching function is challenged by the growing size and number of addresses and by automation. In parallel, some devices at the edge of the Internet, towards the Internet of Things, have limited memory capacity. The study has taken a closer look at present-day needs for name resolution and has considered how DNS is affected by the spread of IPv6 addresses, mobile devices, content delivery networks and web browser functions. In two freely available DNS resolver server programs, the investigation searched for the optimal cache management for constrained devices in, or at the border of, the Internet of Things. Using the open source code of the programs, Unbound and PowerDNS Recursor, their respective structures were interpreted in order to estimate and compare memory requirements. A simulation was then run in a laboratory environment with fictitious DNS data of realistic character to measure the actual memory consumption of the DNS server process. The simulation avoided individually tuning the programs' settings, involving DNSSEC data, and imposing memory constraints on the test environment. The source code analysis estimated that Unbound was more optimal for the A+AAAA record types, while PowerDNS Recursor was more efficient for the PTR record type. For both record types taken together, the measurements in the simulation showed that Unbound could store DNS data more densely than PowerDNS Recursor. The result has shown that the standardized message format for DNS data used in Unbound is less optimal than the object-based one in PowerDNS Recursor. On the other hand, it was shown that Unbound, which is written procedurally in the C programming language, managed cache memory more sparingly than the object-oriented PowerDNS Recursor, which was developed in C++.
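To give a feel for why record type and address size matter for cache memory, the sketch below computes the uncompressed wire-format size of a single resource record, following the RFC 1035 layout (owner name, then 10 bytes of TYPE, CLASS, TTL and RDLENGTH, then the record data). The function names are hypothetical, and the figures are only a lower bound on the record itself: they say nothing about the per-entry overhead that Unbound or PowerDNS Recursor actually allocate.

```python
from typing import Optional

FIXED_RR_FIELDS = 10  # TYPE (2) + CLASS (2) + TTL (4) + RDLENGTH (2) bytes


def encoded_name_len(name: str) -> int:
    """Length in bytes of a domain name in uncompressed wire format."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    # Each label costs one length byte plus its characters; a zero byte ends the name.
    return sum(1 + len(label.encode("ascii")) for label in labels) + 1


def rr_wire_len(owner: str, rtype: str, target: Optional[str] = None) -> int:
    """Uncompressed wire size of one A, AAAA or PTR resource record."""
    if rtype == "A":
        rdata = 4    # IPv4 address
    elif rtype == "AAAA":
        rdata = 16   # IPv6 address
    elif rtype == "PTR":
        if target is None:
            raise ValueError("PTR records need a target name")
        rdata = encoded_name_len(target)
    else:
        raise ValueError(f"unsupported record type: {rtype}")
    return encoded_name_len(owner) + FIXED_RR_FIELDS + rdata
```

Under these assumptions an A record for `example.com` takes 27 bytes and the matching AAAA record takes 39 bytes, while a PTR record carries a second domain name as its data and therefore grows with the length of the name it points to.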
