1 |
Placement of replicas in large-scale data grid environmentsShorfuzzaman, Mohammad 26 March 2012 (has links)
Data Grids provide services and infrastructure for distributed data-intensive applications accessing massive geographically distributed datasets. An important technique to speed access in Data Grids is replication, which provides nearby data access. Although data replication is one of the major techniques for promoting high data access, the problem of replica placement has not been widely studied for large-scale Grid environments. In this thesis, I propose improved data placement techniques useful when replicating potentially large data files in wide area data grids. These techniques are aimed at achieving faster data access as well as efficient utilization of bandwidth and storage resources. At the core of my approach is a new highly distributed replica placement algorithm that places data in strategic locations to improve overall data access performance while satisfying varying user/application and system demands. This improved efficiency of access to large data will improve the practicality of large-scale data and compute intensive collaborative scientific endeavors.
My thesis makes several contributions towards improving the state-of-the-art for replica placement in large-scale data grid environments. The major contributions are: (i) development of a new popularity-driven dynamic replica placement algorithm for hierarchically structured data grids that balance storage space utilisation and access latency; (ii) creation of an adaptive version of the base algorithm to dynamically adapt the frequency and degree of replication based on such factors as data request arrival rates, available storage capacities, etc.; (iii) development of a new highly distributed algorithm to determine a near-optimal replica placement while minimizing replication cost (access and update) for a given traffic pattern; (iv) creation of a distributed QoS-aware replica placement algorithm that supports multiple quality requirements both from user and system perspectives to support efficient transfers of large replicas.
Simulation results using widely observed data access patterns demonstrate how the effectiveness of my replica placement techniques is affected by various factors such as grid network characteristics (i.e. topology, number of nodes, storage and workload capacities of replica servers, link capacities, traffic pattern), QoS requirements, and so on. Finally, I compare the performance of my algorithms to a number of relevant algorithms from the literature and demonstrate their usefulness and superiority for conditions of interest.
|
2 |
Placement of replicas in large-scale data grid environmentsShorfuzzaman, Mohammad 26 March 2012 (has links)
Data Grids provide services and infrastructure for distributed data-intensive applications accessing massive geographically distributed datasets. An important technique to speed access in Data Grids is replication, which provides nearby data access. Although data replication is one of the major techniques for promoting high data access, the problem of replica placement has not been widely studied for large-scale Grid environments. In this thesis, I propose improved data placement techniques useful when replicating potentially large data files in wide area data grids. These techniques are aimed at achieving faster data access as well as efficient utilization of bandwidth and storage resources. At the core of my approach is a new highly distributed replica placement algorithm that places data in strategic locations to improve overall data access performance while satisfying varying user/application and system demands. This improved efficiency of access to large data will improve the practicality of large-scale data and compute intensive collaborative scientific endeavors.
My thesis makes several contributions towards improving the state-of-the-art for replica placement in large-scale data grid environments. The major contributions are: (i) development of a new popularity-driven dynamic replica placement algorithm for hierarchically structured data grids that balance storage space utilisation and access latency; (ii) creation of an adaptive version of the base algorithm to dynamically adapt the frequency and degree of replication based on such factors as data request arrival rates, available storage capacities, etc.; (iii) development of a new highly distributed algorithm to determine a near-optimal replica placement while minimizing replication cost (access and update) for a given traffic pattern; (iv) creation of a distributed QoS-aware replica placement algorithm that supports multiple quality requirements both from user and system perspectives to support efficient transfers of large replicas.
Simulation results using widely observed data access patterns demonstrate how the effectiveness of my replica placement techniques is affected by various factors such as grid network characteristics (i.e. topology, number of nodes, storage and workload capacities of replica servers, link capacities, traffic pattern), QoS requirements, and so on. Finally, I compare the performance of my algorithms to a number of relevant algorithms from the literature and demonstrate their usefulness and superiority for conditions of interest.
|
3 |
Algorithmic Mechanism Design for Data Replication ProblemsGuo, Minzhe 13 September 2016 (has links)
No description available.
|
4 |
The design and implementation of a robust, cost-conscious peer-to-peer lookup serviceHarvesf, Cyrus Mehrabaun 17 November 2008 (has links)
Peer-to-peer (p2p) technology provides an excellent platform for the delivery of rich content and media that scales with the rapid growth of the Internet. This work presents a lookup service design and implementation that provides provable fault tolerance and operates in a cost-conscious manner over the Internet.
<br><br>
Using a distributed hash table (DHT) as a foundation, we propose a replica placement that improves object availability and reachability to implement a robust lookup service. We present a framework that describes tree-based routing DHTs and formally prove several properties for DHTs of this type. Specifically, we prove that our replica placement, which we call MaxDisjoint, creates a provable number of disjoint routes from any source node to a replica set. We evaluate this technique through simulation and demonstrate that it creates disjoint routes more effectively than existing replica placements. Furthermore, we show that disjoint routes have a marked impact on routing robustness, which we measure as the probability of lookup success.
<br><br>
To mitigate the costs incurred by multi-hop DHT routing, we develop an organization-based id assignment scheme that bounds the transit costs of prefix-matching routes. To further reduce costs, we use MaxDisjoint placement to create multiple routes of varying costs. This technique helps reduce cost in two ways: (1) replication may create local copies of an object that can be accessed at zero transit cost and (2) MaxDisjoint replication creates multiple, bounded cost, disjoint routes of which the minimal cost route can be used to resolve the lookup. We model the trade-off between the storage cost and routing cost benefit of replication to find the optimal degree to which an object should be replicated. We evaluate our approach using a lookup service implementation and show that it dramatically reduces cost over existing DHT implementations. Furthermore, we show that our technique can be used to manage objects of varying popularity in a manner that is more cost effective than caching.
<br><br>
By improving its robustness and cost effectiveness, we aim to increase the pervasiveness of p2p in practice and unlock the potential of this powerful technology.
|
5 |
Adaptivitätssensitive Platzierung von Replikaten in Adaptiven Content Distribution Networks / Adaptation-aware Replica Placement in Adaptive Content Distribution NetworksBuchholz, Sven 14 June 2005 (has links) (PDF)
Adaptive Content Distribution Networks (A-CDNs) sind anwendungsübergreifende, verteilte Infrastrukturen, die auf Grundlage verteilter Replikation von Inhalten und Inhaltsadaption eine skalierbare Auslieferung von adaptierbaren multimedialen Inhalten an heterogene Clients ermöglichen. Die Platzierung der Replikate in den Surrogaten eines A-CDN wird durch den Platzierungsmechanismus des A-CDN gesteuert. Anders als in herkömmlichen CDNs, die keine Inhaltsadaption berücksichtigen, muss ein Platzierungsmechanismus in einem A-CDN nicht nur entscheiden, welches Inhaltsobjekt in welchem Surrogat repliziert werden soll, sondern darüber hinaus, in welcher Repräsentation bzw. in welchen Repräsentationen das Inhaltsobjekt zu replizieren ist. Herkömmliche Platzierungsmechanismen sind nicht in der Lage, verschiedene Repräsentationen eines Inhaltsobjektes zu berücksichtigen. Beim Einsatz herkömmlicher Platzierungsmechanismen in A-CDNs können deshalb entweder nur statisch voradaptierte Repräsentationen oder ausschließlich generische Repräsentationen repliziert werden. Während bei der Replikation von statisch voradaptierten Repräsentationen die Wiederverwendbarkeit der Replikate eingeschränkt ist, führt die Replikation der generischen Repräsentationen zu erhöhten Kosten und Verzögerungen für die dynamische Adaption der Inhalte bei jeder Anfrage. Deshalb werden in der Arbeit adaptivitätssensitive Platzierungsmechanismen zur Platzierung von Replikaten in A-CDNs vorgeschlagen. Durch die Berücksichtigung der Adaptierbarkeit der Inhalte bei der Ermittlung einer Platzierung von Replikaten in den Surrogaten des A-CDNs können adaptivitätssensitive Platzierungsmechanismen sowohl generische und statisch voradaptierte als auch teilweise adaptierte Repräsentationen replizieren. Somit sind sie in der Lage statische und dynamische Inhaltsadaption flexibel miteinander zu kombinieren. Das Ziel der vorliegenden Arbeit ist zu evaluieren, welche Vorteile sich durch die Berücksichtigung der Inhaltsadaption bei Platzierung von adaptierbaren Inhalten in A-CDNs realisieren lassen. Hierzu wird das Problem der adaptivitätssensitiven Platzierung von Replikaten in A-CDNs als Optimierungsproblem formalisiert, Algorithmen zur Lösung des Optimierungsproblems vorgeschlagen und diese in einem Simulator implementiert. Das zugrunde liegende Simulationsmodell beschreibt ein im Internet verteiltes A-CDN, welches zur Auslieferung von JPEG-Bildern an heterogene mobile und stationäre Clients verwendet wird. Anhand dieses Simulationsmodells wird die Leistungsfähigkeit der adaptivitätssensitiven Platzierungsmechanismen evaluiert und mit der von herkömmlichen Platzierungsmechanismen verglichen. Die Simulationen zeigen, dass der adaptivitätssensitive Ansatz in Abhängigkeit vom System- und Lastmodell sowie von der Speicherkapazität der Surrogate im A-CDN in vielen Fällen Vorteile gegenüber dem Einsatz herkömmlicher Platzierungsmechanismen mit sich bringt. Wenn sich die Anfragelasten verschiedener Typen von Clients jedoch nur wenig oder gar nicht überlappen oder bei hinreichend großer Speicherkapazität der Surrogate hat der adaptivitätssensitive Ansatz keine signifikanten Vorteile gegenüber dem Einsatz eines herkömmlichen Platzierungsmechanismus. / Adaptive Content Distribution Networks (A-CDNs) are application independent, distributed infrastructures using content adaptation and distributed replication of contents to allow the scalable delivery of adaptable multimedia contents to heterogeneous clients. The replica placement in an A-CDN is controlled by the placement mechanisms of the A-CDN. As opposed to traditional CDNs, which do not take content adaptation into consideration, a replica placement mechanism in an A-CDN has to decide not only which object shall be stored in which surrogate but also which representation or which representations of the object to replicate. Traditional replica placement mechanisms are incapable of taking different representations of the same object into consideration. That is why A-CDNs that use traditional replica placement mechanisms may only replicate generic or statically adapted representations. The replication of statically adapted representations reduces the sharing of the replicas. The replication of generic representations results in adaptation costs and delays with every request. That is why the dissertation thesis proposes the application of adaptation-aware replica placement mechanisms. By taking the adaptability of the contents into account, adaptation-aware replica placement mechanisms may replicate generic, statically adapted and even partially adapted representations of an object. Thus, they are able to balance between static and dynamic content adaptation. The dissertation is targeted at the evaluation of the performance advantages of taking knowledge about the adaptability of contents into consideration when calculating a placement of replicas in an A-CDN. Therefore the problem of adaptation-aware replica placement is formalized as an optimization problem; algorithms for solving the optimization problem are proposed and implemented in a simulator. The underlying simulation model describes an Internet-wide distributed A-CDN that is used for the delivery of JPEG images to heterogeneous mobile and stationary clients. Based on the simulation model, the performance of the adaptation-aware replica placement mechanisms are evaluated and compared to traditional replica placement mechanisms. The simulations prove that the adaptation-aware approach is superior to the traditional replica placement mechanisms in many cases depending on the system and load model as well as the storage capacity of the surrogates of the A-CDN. However, if the load of different types of clients do hardly overlap or with sufficient storage capacity within the surrogates, the adaptation-aware approach has no significant advantages over the application of traditional replica-placement mechanisms.
|
6 |
Adaptivitätssensitive Platzierung von Replikaten in Adaptiven Content Distribution NetworksBuchholz, Sven 08 July 2005 (has links)
Adaptive Content Distribution Networks (A-CDNs) sind anwendungsübergreifende, verteilte Infrastrukturen, die auf Grundlage verteilter Replikation von Inhalten und Inhaltsadaption eine skalierbare Auslieferung von adaptierbaren multimedialen Inhalten an heterogene Clients ermöglichen. Die Platzierung der Replikate in den Surrogaten eines A-CDN wird durch den Platzierungsmechanismus des A-CDN gesteuert. Anders als in herkömmlichen CDNs, die keine Inhaltsadaption berücksichtigen, muss ein Platzierungsmechanismus in einem A-CDN nicht nur entscheiden, welches Inhaltsobjekt in welchem Surrogat repliziert werden soll, sondern darüber hinaus, in welcher Repräsentation bzw. in welchen Repräsentationen das Inhaltsobjekt zu replizieren ist. Herkömmliche Platzierungsmechanismen sind nicht in der Lage, verschiedene Repräsentationen eines Inhaltsobjektes zu berücksichtigen. Beim Einsatz herkömmlicher Platzierungsmechanismen in A-CDNs können deshalb entweder nur statisch voradaptierte Repräsentationen oder ausschließlich generische Repräsentationen repliziert werden. Während bei der Replikation von statisch voradaptierten Repräsentationen die Wiederverwendbarkeit der Replikate eingeschränkt ist, führt die Replikation der generischen Repräsentationen zu erhöhten Kosten und Verzögerungen für die dynamische Adaption der Inhalte bei jeder Anfrage. Deshalb werden in der Arbeit adaptivitätssensitive Platzierungsmechanismen zur Platzierung von Replikaten in A-CDNs vorgeschlagen. Durch die Berücksichtigung der Adaptierbarkeit der Inhalte bei der Ermittlung einer Platzierung von Replikaten in den Surrogaten des A-CDNs können adaptivitätssensitive Platzierungsmechanismen sowohl generische und statisch voradaptierte als auch teilweise adaptierte Repräsentationen replizieren. Somit sind sie in der Lage statische und dynamische Inhaltsadaption flexibel miteinander zu kombinieren. Das Ziel der vorliegenden Arbeit ist zu evaluieren, welche Vorteile sich durch die Berücksichtigung der Inhaltsadaption bei Platzierung von adaptierbaren Inhalten in A-CDNs realisieren lassen. Hierzu wird das Problem der adaptivitätssensitiven Platzierung von Replikaten in A-CDNs als Optimierungsproblem formalisiert, Algorithmen zur Lösung des Optimierungsproblems vorgeschlagen und diese in einem Simulator implementiert. Das zugrunde liegende Simulationsmodell beschreibt ein im Internet verteiltes A-CDN, welches zur Auslieferung von JPEG-Bildern an heterogene mobile und stationäre Clients verwendet wird. Anhand dieses Simulationsmodells wird die Leistungsfähigkeit der adaptivitätssensitiven Platzierungsmechanismen evaluiert und mit der von herkömmlichen Platzierungsmechanismen verglichen. Die Simulationen zeigen, dass der adaptivitätssensitive Ansatz in Abhängigkeit vom System- und Lastmodell sowie von der Speicherkapazität der Surrogate im A-CDN in vielen Fällen Vorteile gegenüber dem Einsatz herkömmlicher Platzierungsmechanismen mit sich bringt. Wenn sich die Anfragelasten verschiedener Typen von Clients jedoch nur wenig oder gar nicht überlappen oder bei hinreichend großer Speicherkapazität der Surrogate hat der adaptivitätssensitive Ansatz keine signifikanten Vorteile gegenüber dem Einsatz eines herkömmlichen Platzierungsmechanismus. / Adaptive Content Distribution Networks (A-CDNs) are application independent, distributed infrastructures using content adaptation and distributed replication of contents to allow the scalable delivery of adaptable multimedia contents to heterogeneous clients. The replica placement in an A-CDN is controlled by the placement mechanisms of the A-CDN. As opposed to traditional CDNs, which do not take content adaptation into consideration, a replica placement mechanism in an A-CDN has to decide not only which object shall be stored in which surrogate but also which representation or which representations of the object to replicate. Traditional replica placement mechanisms are incapable of taking different representations of the same object into consideration. That is why A-CDNs that use traditional replica placement mechanisms may only replicate generic or statically adapted representations. The replication of statically adapted representations reduces the sharing of the replicas. The replication of generic representations results in adaptation costs and delays with every request. That is why the dissertation thesis proposes the application of adaptation-aware replica placement mechanisms. By taking the adaptability of the contents into account, adaptation-aware replica placement mechanisms may replicate generic, statically adapted and even partially adapted representations of an object. Thus, they are able to balance between static and dynamic content adaptation. The dissertation is targeted at the evaluation of the performance advantages of taking knowledge about the adaptability of contents into consideration when calculating a placement of replicas in an A-CDN. Therefore the problem of adaptation-aware replica placement is formalized as an optimization problem; algorithms for solving the optimization problem are proposed and implemented in a simulator. The underlying simulation model describes an Internet-wide distributed A-CDN that is used for the delivery of JPEG images to heterogeneous mobile and stationary clients. Based on the simulation model, the performance of the adaptation-aware replica placement mechanisms are evaluated and compared to traditional replica placement mechanisms. The simulations prove that the adaptation-aware approach is superior to the traditional replica placement mechanisms in many cases depending on the system and load model as well as the storage capacity of the surrogates of the A-CDN. However, if the load of different types of clients do hardly overlap or with sufficient storage capacity within the surrogates, the adaptation-aware approach has no significant advantages over the application of traditional replica-placement mechanisms.
|
7 |
Energy-aware scheduling : complexity and algorithmsRenaud-Goud, Paul 05 July 2012 (has links) (PDF)
In this thesis we have tackled a few scheduling problems under energy constraint, since the energy issue is becoming crucial, for both economical and environmental reasons. In the first chapter, we exhibit tight bounds on the energy metric of a classical algorithm that minimizes the makespan of independent tasks. In the second chapter, we schedule several independent but concurrent pipelined applications and address problems combining multiple criteria, which are period, latency and energy. We perform an exhaustive complexity study and describe the performance of new heuristics. In the third chapter, we study the replica placement problem in a tree network. We try to minimize the energy consumption in a dynamic frame. After a complexity study, we confirm the quality of our heuristics through a complete set of simulations. In the fourth chapter, we come back to streaming applications, but in the form of series-parallel graphs, and try to map them onto a chip multiprocessor. The design of a polynomial algorithm on a simple problem allows us to derive heuristics on the most general problem, whose NP-completeness has been proven. In the fifth chapter, we study energy bounds of different routing policies in chip multiprocessors, compared to the classical XY routing, and develop new routing heuristics. In the last chapter, we compare the performance of different algorithms of the literature that tackle the problem of mapping DAG applications to minimize the energy consumption.
|
8 |
Energy-Efficient Key/Value StoreTena, Frezewd Lemma 11 September 2017 (has links) (PDF)
Energy conservation is a major concern in todays data centers, which are the 21st century data processing factories, and where large and complex software systems such as distributed data management stores run and serve billions of users. The two main drivers of this major concern are the pollution impact data centers have on the environment due to their waste heat, and the expensive cost data centers incur due to their enormous energy demand. Among the many subsystems of data centers, the storage system is one of the main sources of energy consumption. Among the many types of storage systems, key/value stores happen to be the widely used in the data centers. In this work, I investigate energy saving techniques that enable a consistent hash based key/value store save energy during low activity times, and whenever there is an opportunity to reuse the waste heat of data centers.
|
9 |
Energy-Efficient Key/Value StoreTena, Frezewd Lemma 29 August 2017 (has links)
Energy conservation is a major concern in todays data centers, which are the 21st century data processing factories, and where large and complex software systems such as distributed data management stores run and serve billions of users. The two main drivers of this major concern are the pollution impact data centers have on the environment due to their waste heat, and the expensive cost data centers incur due to their enormous energy demand. Among the many subsystems of data centers, the storage system is one of the main sources of energy consumption. Among the many types of storage systems, key/value stores happen to be the widely used in the data centers. In this work, I investigate energy saving techniques that enable a consistent hash based key/value store save energy during low activity times, and whenever there is an opportunity to reuse the waste heat of data centers.
|
10 |
Energy-aware scheduling : complexity and algorithms / Ordonnancement sous contrainte d'énergie : complexité et algorithmesRenaud-Goud, Paul 05 July 2012 (has links)
Dans cette thèse, nous nous sommes intéressés à des problèmes d'ordonnancement sous contrainte d'énergie, puisque la réduction de l'énergie est devenue une nécessité, tant sur le plan économique qu'écologique. Dans le premier chapitre, nous exhibons des bornes strictes sur l'énergie d'un algorithme classique qui minimise le temps d'exécution de tâches indépendantes. Dans le second chapitre, nous ordonnançons plusieurs applications chaînées de type « streaming », et nous étudions des problèmes contraignant l'énergie, la période et la latence. Nous effectuons une étude de complexité exhaustive, et décrivons les performances de nouvelles heuristiques. Dans le troisième chapitre, nous étudions le problème de placement de répliques dans un réseau arborescent. Nous nous plaçons dans un cadre dynamique, et nous bornons à minimiser l'énergie. Après une étude de complexité, nous confirmons la qualité de nos heuristiques grâce à un jeu complet de simulations. Dans le quatrième chapitre, nous revenons aux applications « streaming », mais sous forme de graphes série-parallèles, et nous tentons de les placer sur un processeur multi-cœur. La découverte d'un algorithme polynomial sur un problème simple nous permet la conception d'heuristiques sur le problème le plus général dont nous avons établi la NP-complétude. Dans le cinquième chapitre, nous étudions des bornes énergétiques de politiques de routage dans des processeurs multi-cœurs, en comparaison avec le routage classique XY, et développons de nouvheuristiques de routage. Dans le dernier chapitre, nous étudions expérimentalement le placement d'applications sous forme de DAG sur des machines réelles. / In this thesis we have tackled a few scheduling problems under energy constraint, since the energy issue is becoming crucial, for both economical and environmental reasons. In the first chapter, we exhibit tight bounds on the energy metric of a classical algorithm that minimizes the makespan of independent tasks. In the second chapter, we schedule several independent but concurrent pipelined applications and address problems combining multiple criteria, which are period, latency and energy. We perform an exhaustive complexity study and describe the performance of new heuristics. In the third chapter, we study the replica placement problem in a tree network. We try to minimize the energy consumption in a dynamic frame. After a complexity study, we confirm the quality of our heuristics through a complete set of simulations. In the fourth chapter, we come back to streaming applications, but in the form of series-parallel graphs, and try to map them onto a chip multiprocessor. The design of a polynomial algorithm on a simple problem allows us to derive heuristics on the most general problem, whose NP-completeness has been proven. In the fifth chapter, we study energy bounds of different routing policies in chip multiprocessors, compared to the classical XY routing, and develop new routing heuristics. In the last chapter, we compare the performance of different algorithms of the literature that tackle the problem of mapping DAG applications to minimize the energy consumption.
|
Page generated in 0.0957 seconds