Global ETD Search

21	Mobility And Power Aware Data Interest Based Data Replication For Mobile Ad Hoc Networks Arslan, Secil 01 September 2007 (has links) (PDF) One of the challenging issues for mobile ad hoc network (MANET) applications is data replication. Unreliable wireless communication, the mobility of network participants and the limited resource capacities of mobile devices make conventional replication techniques useless for MANETs. Frequent network divisions and unexpected disconnections must be handled. In this thesis work, a novel mobility and power aware, data interest based data replication strategy is presented. The main objective is to improve data accessibility among a mission critical mobility group. A clustering approach based on similarities in mobility and data interest patterns is introduced. The investigated replica allocation methodology takes into account data access frequency and data correlation values together with the mobile nodes' remaining energy and memory capacities. Performance of the proposed approach is analyzed in terms of data accessibility, cache hit ratio and traffic metrics. Combining data interest based clustering with mobility awareness yields improvements over clustering that is only mobility aware. The advantages of power aware replica allocation are demonstrated by experimental simulations. QA Computer Software 76.75-76.765
22	Dinaminis kompiuterinių sistemų infrastruktūros atnaujinimo modelis, pagrįstas atviro kodo sprendimais / Dynamic update model of computer systems infrastructure based on open source solutions Pachomov, Artiom 17 July 2014 (has links) Šiame darbe analizuojamos įmonės su užsistovėjusia bei pasenusia programine įranga dinaminis atnaujinimas utilizuojant naujos atviro kodo nemokamos įrangos galimybes. Formuojamas dinaminis modelis, kuriame pritaikomi nepertraukiamų paslaugų, vartotojų centralizuoto valdymo bei neprarandamų duomenų sprendimai. Taip pat pateikiama analizė, kaip atlikti paslaugų įrangos migravimą bei sukurti pagalbinę IT infrastruktūros dalį, optimizuojančia sistemų priežiūrą bei našumą. / This thesis analyzes the dynamic updating of system software in an institution with an outdated system infrastructure, making full use of free, open source solutions. A dynamic model is formed that includes identity management, high availability clustering, data replication and data integrity solutions. An additional analysis covers service migration and the optimization of IT infrastructure usage. Informatics Duomenų replikavimas Klasteris Nepertraukiamos paslaugos tiekimas Paslaugų migravimas Clustering High availability Data replication Uninterrupted service supply
23	Análise de mecanismos de replicação de dados para grades de computadores / Analysis of data replication mechanisms for grid computing Leonardo Pancieri Ferreira de Freitas 28 April 2010 (has links) Uma grade de dados (datagrid) é um ambiente computacional distribuído destinado a agregar e compartilhar recursos de armazenamento que estão geograficamente separados. As grades de dados provêm infra-estrutura e serviços para descoberta, transferência, manipulação e gerenciamento de grandes quantidades de dados armazenados em repositórios distribuídos. As grades de dados possuem características em comum com redes peer-to-peer, bancos de dados distribuídos e redes de distribuição de conteúdo (CDN). Muitas das políticas de replicação, substituição e busca utilizadas nestas redes são comuns com as de grades de dados. O foco deste trabalho é estudar a influência das políticas (replicação, substituição e busca), topologias de rede e interação entre políticas no desempenho para armazenamento, busca de arquivos, sobrevivência de réplicas e tráfego de rede. O estudo também considera as interações das diversas políticas com estrutura topológicas da rede e falhas em nós. A metodologia adotada para realizar avaliações no trabalho foi a simulação. Com o simulador foi possível concentrar os estudos nas relações entre as políticas e topologias de rede, evitando complexidade de uma grade como heterogeneidade de recursos. Nos resultados foram encontrados efeitos importantes de características topológicas em redes e sua interação com políticas utilizadas no desempenho da grade. / A datagrid is a distributed computing environment designed to aggregate and share storage resources that are geographically distant. Datagrids provide infrastructure and services for the discovery, transfer, handling and management of large amounts of data stored in distributed repositories. Datagrids share characteristics with peer-to-peer networks, distributed databases and content distribution networks (CDN). 
Many of the replication, placement and search policies used in these networks are shared with datagrids. The focus of this work is to study the influence of policies (replication, placement and search), network topologies and their interaction on the performance of storage, file searching, replica survival and network traffic. This research also considers the interactions of the various policies with network topologies under node failures. The evaluation methodology adopted in this work is simulation. With a simulator it is possible to concentrate the study on the relationships between policies and network topologies, avoiding the complexity of a grid with heterogeneous resources. The results show that topological features of the networks, and their interaction with the policies, have significant effects on grid performance. Grades de dados Políticas de busca Políticas de replicação Políticas de substituição Replicação de dados Data replication Datagrids Placement strategies Replication strategies Search strategies
24	Metadados para reconciliação de transações em bancos de dados autônomos / Metadata for transaction reconciliation in autonomous databases Duarte, Gustavo Luiz 19 December 2011 (has links) O uso de técnicas de replicação de dados em dispositivos móveis permite que uma aplicação móvel compartilhe dados com um servidor e possa atuar sobre tais dados durante períodos de desconexão. Embora essa característica seja fundamental em diversos domínios, a reconciliação das transações que foram aplicadas sobre a réplica móvel dos dados apresenta-se como um desafio a ser superado. O uso de bloqueios apresenta-se impraticável em determinados domínios de aplicação. Por outro lado, ao permitir a execução de operações de escrita em diversas réplicas sem uma sincronização a priori, o sistema se torna suscetível a conflitos de atualização, sendo necessário a implementação de um mecanismo de resolução de conflitos. Resolver conflitos é uma tarefa complexa e propensa a erros, em especial nos casos em que há a necessidade de intervenção humana. Diante desse cenário, foi desenvolvido um modelo para controle de transações em bancos de dados autônomos que faz uso de metadados e multiversão de banco de dados de forma a permitir a auditoria e retificação de resoluções de conflitos. Isso torna a resolução de conflitos uma operação não destrutiva, reduzindo, assim, o impacto de uma resolução de conflito incorreta. Neste trabalho é apresentado também um arcabouço para reconciliação de transações que implementa o modelo proposto. Como estudo de caso, o arcabouço desenvolvido foi utilizado para implementar a integração entre dois sistemas reais que possuem necessidades de replicação de dados e atualizações desconectadas. / The use of data replication techniques on mobile devices allows a mobile application to share data with a server and to work on such data while disconnected. 
While this feature is crucial in some application domains, the reconciliation of transactions applied to the mobile replica of data proves to be challenging. The use of locking is not feasible in some application domains. However, allowing write operations to be applied on several replicas without a priori synchronization makes the system susceptible to update conflicts, requiring a conflict resolution mechanism. Conflict resolution is a complex and error-prone task, especially when human intervention is involved. Given this scenario, we developed a transaction control model for autonomous databases that uses metadata and database versioning to provide auditing and rectification of conflict resolutions. This turns conflict resolution into a nondestructive operation, thus reducing the impact of an incorrect conflict resolution. This work also presents a framework for transaction reconciliation that implements the proposed model. As a case study, the developed framework was used to integrate two real systems that needed data replication and disconnected updates. autonomous databases bancos de dados autônomos bancos de dados móveis conflicts conflitos data replication mobile database reconciliação de transações replicação de dados transaction reconciliation versionamento versioning
25	Réplication de données dans les systèmes de gestion de données à grande échelle / Data replication in large-scale data management systems Tos, Uras 27 June 2017 (has links) Ces dernières années, la popularité croissante des applications, e.g. les expériences scientifiques, Internet des objets et les réseaux sociaux, a conduit à la génération de gros volumes de données. La gestion de telles données qui de plus, sont hétérogènes et distribuées à grande échelle, constitue un défi important. Dans les systèmes traditionnels tels que les systèmes distribués et parallèles, les systèmes pair-à-pair et les systèmes de grille, répondre à des objectifs tels que l'obtention de performances acceptables tout en garantissant une bonne disponibilité de données constituent des objectifs majeurs pour l'utilisateur, en particulier lorsque ces données sont réparties à travers le monde. Dans ce contexte, la réplication de données, une technique très connue, permet notamment: (i) d'augmenter la disponibilité de données, (ii) de réduire les coûts d'accès aux données et (iii) d'assurer une meilleure tolérance aux pannes. Néanmoins, répliquer les données sur tous les nœuds est une solution non réaliste vu qu'elle génère une consommation importante de la bande passante en plus de l'espace limité de stockage. Définir des stratégies de réplication constitue la solution à apporter à ces problématiques. Les stratégies de réplication de données qui ont été proposées pour les systèmes traditionnels cités précédemment ont pour objectif l'amélioration des performances pour l'utilisateur. Elles sont difficiles à adapter dans les systèmes de cloud. En effet, le fournisseur de cloud a pour but de générer un profit en plus de répondre aux exigences des locataires. 
Satisfaire les attentes de ces locataire en matière de performances sans sacrifier le profit du fournisseur d'un coté et la gestion élastiques des ressources avec une tarification suivant le modèle 'pay-as-you-go' d'un autre coté, constituent des principes fondamentaux dans les systèmes cloud. Dans cette thèse, nous proposons une stratégie de réplication de données pour satisfaire les exigences du locataire, e.g. les performances, tout en garantissant le profit économique du fournisseur. En se basant sur un modèle de coût, nous estimons le temps de réponse nécessaire pour l'exécution d'une requête distribuée. La réplication de données n'est envisagée que si le temps de réponse estimé dépasse un seuil fixé auparavant dans le contrat établi entre le fournisseur et le client. Ensuite, cette réplication doit être profitable du point de vue économique pour le fournisseur. Dans ce contexte, nous proposons un modèle économique prenant en compte aussi bien les dépenses et les revenus du fournisseur lors de l'exécution de cette requête. Nous proposons une heuristique pour le placement des répliques afin de réduire les temps d'accès à ces nouvelles répliques. De plus, un ajustement du nombre de répliques est adopté afin de permettre une gestion élastique des ressources. Nous validons la stratégie proposée par une évaluation basée sur une simulation. Nous comparons les performances de notre stratégie à celles d'une autre stratégie de réplication proposée dans les clouds. L'analyse des résultats obtenus a montré que les deux stratégies comparées répondent à l'objectif de performances pour le locataire. Néanmoins, une réplique de données n'est crée, avec notre stratégie, que si cette réplication est profitable pour le fournisseur. / In recent years, growing popularity of large-scale applications, e.g. scientific experiments, Internet of things and social networking, led to generation of large volumes of data. 
The management of this data presents a significant challenge as the data is heterogeneous and distributed on a large scale. In traditional systems including distributed and parallel systems, peer-to-peer systems and grid systems, meeting objectives such as achieving acceptable performance while ensuring good availability of data are major challenges for service providers, especially when the data is distributed around the world. In this context, data replication, as a well-known technique, allows: (i) increased data availability, (ii) reduced data access costs, and (iii) improved fault-tolerance. However, replicating data on all nodes is an unrealistic solution as it generates significant bandwidth consumption in addition to exhausting limited storage space. Defining good replication strategies is a solution to these problems. The data replication strategies that have been proposed for the traditional systems mentioned above are intended to improve performance for the user. They are difficult to adapt to cloud systems. Indeed, cloud providers aim to generate a profit in addition to meeting tenant requirements. Meeting the performance expectations of the tenants without sacrificing the provider's profit, as well as managing resource elasticities with a pay-as-you-go pricing model, are the fundamentals of cloud systems. In this thesis, we propose a data replication strategy that satisfies the requirements of the tenant, such as performance, while guaranteeing the economic profit of the provider. Based on a cost model, we estimate the response time required to execute a distributed database query. Data replication is only considered if, for any query, the estimated response time exceeds a threshold previously set in the contract between the provider and the tenant. Then, the planned replication must also be economically beneficial to the provider. 
In this context, we propose an economic model that takes into account both the expenditures and the revenues of the provider during the execution of any particular database query. Once replication has been decided, a heuristic placement approach is used to place the new replicas so as to reduce the access time. In addition, a dynamic adjustment of the number of replicas is adopted to allow elastic management of resources. The proposed strategy is validated in an experimental evaluation carried out in a simulation environment. It is compared with another data replication strategy proposed for cloud systems, and the analysis of the results shows that both strategies meet the performance objective for the tenant. With our strategy, however, a replica is created only if the replication is profitable for the provider. Systèmes cloud Requêtes de base de données Réplication de données Evaluation de performances Profit économique Cloud Computing Database Queries Data Replication Performance Evaluation Economic Benefit
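The two-step replication decision described in the abstract of entry 25 (a response-time threshold check, then a profitability check) can be illustrated with a minimal Python sketch. The function name, parameters and the simple "revenue minus expenditure" profit test are assumptions made for illustration; the abstract does not give the thesis's actual cost or economic model.

```python
def should_replicate(estimated_response_time: float,
                     sla_threshold: float,
                     provider_revenue: float,
                     provider_expenditure: float) -> bool:
    """Hypothetical decision rule, loosely following the abstract.

    All four inputs are assumed to be produced elsewhere (for example by
    a cost model); none of these names comes from the thesis itself.
    """
    # Step 1: replication is only considered when the estimated response
    # time exceeds the threshold agreed between provider and tenant.
    if estimated_response_time <= sla_threshold:
        return False
    # Step 2: the replication must also be profitable for the provider.
    # Profit is assumed here to be revenue minus expenditure.
    return provider_revenue - provider_expenditure > 0


# A slow query whose replication stays profitable triggers replication;
# a slow query whose replication would lose money does not.
decision_profitable = should_replicate(2.5, 1.0, 10.0, 4.0)
decision_unprofitable = should_replicate(2.5, 1.0, 3.0, 4.0)
```

Keeping the threshold check first mirrors the order given in the abstract: profitability is evaluated only for queries that would otherwise miss the agreed response time.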
26	Metadados para reconciliação de transações em bancos de dados autônomos / Metadata for transaction reconciliation in autonomous databases Gustavo Luiz Duarte 19 December 2011 (has links) O uso de técnicas de replicação de dados em dispositivos móveis permite que uma aplicação móvel compartilhe dados com um servidor e possa atuar sobre tais dados durante períodos de desconexão. Embora essa característica seja fundamental em diversos domínios, a reconciliação das transações que foram aplicadas sobre a réplica móvel dos dados apresenta-se como um desafio a ser superado. O uso de bloqueios apresenta-se impraticável em determinados domínios de aplicação. Por outro lado, ao permitir a execução de operações de escrita em diversas réplicas sem uma sincronização a priori, o sistema se torna suscetível a conflitos de atualização, sendo necessário a implementação de um mecanismo de resolução de conflitos. Resolver conflitos é uma tarefa complexa e propensa a erros, em especial nos casos em que há a necessidade de intervenção humana. Diante desse cenário, foi desenvolvido um modelo para controle de transações em bancos de dados autônomos que faz uso de metadados e multiversão de banco de dados de forma a permitir a auditoria e retificação de resoluções de conflitos. Isso torna a resolução de conflitos uma operação não destrutiva, reduzindo, assim, o impacto de uma resolução de conflito incorreta. Neste trabalho é apresentado também um arcabouço para reconciliação de transações que implementa o modelo proposto. Como estudo de caso, o arcabouço desenvolvido foi utilizado para implementar a integração entre dois sistemas reais que possuem necessidades de replicação de dados e atualizações desconectadas. / The use of data replication techniques on mobile devices allows a mobile application to share data with a server and to work on such data while disconnected. 
While this feature is crucial in some application domains, the reconciliation of transactions applied to the mobile replica of data proves to be challenging. The use of locking is not feasible in some application domains. However, allowing write operations to be applied on several replicas without a priori synchronization makes the system susceptible to update conflicts, requiring a conflict resolution mechanism. Conflict resolution is a complex and error-prone task, especially when human intervention is involved. Given this scenario, we developed a transaction control model for autonomous databases that uses metadata and database versioning to provide auditing and rectification of conflict resolutions. This turns conflict resolution into a nondestructive operation, thus reducing the impact of an incorrect conflict resolution. This work also presents a framework for transaction reconciliation that implements the proposed model. As a case study, the developed framework was used to integrate two real systems that needed data replication and disconnected updates. bancos de dados autônomos bancos de dados móveis conflitos reconciliação de transações replicação de dados versionamento autonomous databases conflicts data replication mobile database transaction reconciliation versioning
27	Increasing data availability in mobile ad-hoc networks : A community-centric and resource-aware replication approach / Vers une meilleure disponibilité des données dans les réseaux ad-hoc mobiles : Proposition d’une méthodologie de réplication fondée sur la notion de communauté d’intérêt et le contrôle des ressources Torbey Takkouz, Zeina 28 September 2012 (has links) Les réseaux ad hoc mobiles sont des réseaux qui se forment spontanément grâce à la présence de terminaux mobiles. Ces réseaux sans fil sont de faible capacité. Les nœuds se déplacent librement et de manière imprévisible et ils se déchargent très rapidement. En conséquence, un réseau MANET est très enclin à subir des partitionnements fréquents. Les applications déployées sur de tels réseaux, souffrent de problèmes de disponibilité des données induits par ces partitionnements. La réplication des données constitue un mécanisme prometteur pour pallier ce problème. Cependant, la mise en œuvre d’un tel mécanisme dans un environnement aussi contraint en ressources constitue un réel défi. L’objectif principal est donc de réaliser un mécanisme peu consommateur en ressources. Le second objectif de la réplication est de permettre le rééquilibrage de la charge induite par les requêtes de données. Le choix des données à répliquer ainsi que celui des nœuds optimaux pour le placement des futurs réplicas est donc crucial, spécialement dans le contexte du MANET. Dans cette thèse, nous proposons CReaM (Community-Centric and Resource-Aware Replication Model”) un modèle de réplication adapté à un réseau MANET. CReaM fonctionne en mode autonomique : les prises de décisions se basent sur des informations collectées dans le voisinage du nœud plutôt que sur des données globalement impliquant tous les nœuds, ce qui permet de réduire le trafic réseau lié à la réplication. Pour réduire l’usage des ressources induit par la réplication sur un nœud, les niveaux de consommation des ressources sont contrôlés par un moniteur. 
Toute consommation excédant un seuil prédéfini lié à cette ressource déclenche le processus de réplication. Pour permettre le choix de la donnée à répliquer, une classification multi critères a été proposée (rareté de la donnée, sémantique, niveau de demande); et un moteur d’inférence qui prend en compte l’état de consommation des ressources du nœud pour désigner la catégorie la plus adaptée pour choisir la donnée à répliquer. Pour permettre de placer les réplicas au plus près des nœuds intéressés, CReaM propose un mécanisme pour l’identification et le maintien à jour des centres d’intérêt des nœuds. Les utilisateurs intéressés par un même sujet constituent une communauté. Par ailleurs, chaque donnée à répliquer est estampillée par le ou les sujets au(x)quel(s) elle s’apparente. Un nœud désirant placer un réplica apparenté à un sujet choisira le nœud ayant la plus grande communauté sur ce sujet. Les résultats d’expérimentations confirment la capacité de CReaM à améliorer la disponibilité des données au même niveau que les solutions concurrentes, tout en réduisant la charge liée à la réplication. D’autre part, CReaM permet de respecter l’état de consommation des ressources sur les nœuds. / A Mobile Ad-hoc Network is a self-configured infrastructure-less network. It consists of autonomous mobile nodes that communicate over bandwidth-constrained wireless links. Nodes in a MANET are free to move randomly and organize themselves arbitrarily. They can join/quit the network in an unpredictable way; such rapid and untimely disconnections may cause network partitioning. In such cases, the network faces multiple difficulties. One major problem is data availability. Data replication is a possible solution to increase data availability. However, implementing replication in MANET is not a trivial task due to two major issues: the resource-constrained environment and the dynamicity of the environment makes making replication decisions a very tough problem. 
In this thesis, we propose a fully decentralized replication model for MANETs. This model is called CReaM: “Community-Centric and Resource-Aware Replication Model”. It is designed to cause as little additional network traffic as possible. To preserve device resources, a monitoring mechanism is proposed. When the consumption of one resource exceeds a predefined threshold, replication is initiated with the goal of balancing the load caused by requests over other nodes. The data item to replicate is selected depending on the type of resource that triggered the replication process. In the case of high CPU consumption, the best data item to replicate is the one that best alleviates the load of the node, i.e. a highly requested data item. Conversely, in the case of low battery, rare data items are to be replicated; a data item is considered rare when it is tagged with a hot topic (a topic with a large community of interested users) but has not yet been disseminated to other nodes. To this end, we introduce a data item classification based on multiple criteria, e.g. data rarity, level of demand and semantics of the content. To select the replica holder, we propose a lightweight solution to collect information about the interests of participating users. Users interested in the same topic form a so-called “community of interest”. Through a tag analysis, a data item is assigned to one or more communities of interest. Based on this analysis of the social usage of the data, replicas are placed close to the centers of the communities of interest, i.e. on the nodes with the highest connectivity with the members of the community. The evaluation results show that CReaM meets its main objectives. In particular, it imposes a dramatically lower overhead than traditional periodical replication systems (less than 50% on average), while maintaining data availability at a level comparable to that of competing solutions. 
Réseau Ad Hoc Réseau mobile Réseau mobile ad-hoc Disponibilité des données Réplication des données Communauté d'intérêt Monitoring des ressources Ad hoc Network Mobile Network Data availability Data replication Communities of Interests 621.382 107 2
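The item selection and replica placement rules summarized in the CReaM abstract (entry 27) can be sketched in Python as below. This is only an illustrative reading of the abstract: the class, the field names, the trigger labels and the tie-breaking by request count are all hypothetical, and placement follows the "largest community on the topic" wording of the French abstract.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DataItem:
    name: str            # hypothetical identifier of the data item
    request_count: int   # assumed measure of the level of demand
    is_rare: bool        # assumed flag: hot topic, not yet disseminated


def pick_item(trigger: str, items: List[DataItem]) -> Optional[DataItem]:
    """Choose what to replicate based on the resource that hit its threshold."""
    if trigger == "cpu":
        # High CPU load: replicate the most requested item to offload requests.
        return max(items, key=lambda d: d.request_count, default=None)
    if trigger == "battery":
        # Low battery: replicate a rare item before the node disappears.
        # Breaking ties by request count is an assumption of this sketch.
        rare = [d for d in items if d.is_rare]
        return max(rare, key=lambda d: d.request_count, default=None)
    return None


def pick_holder(community_size_by_node: Dict[str, int]) -> Optional[str]:
    """Place the replica on the neighbor with the largest community for the topic."""
    if not community_size_by_node:
        return None
    return max(community_size_by_node, key=community_size_by_node.get)


items = [DataItem("map", 40, False), DataItem("alert", 5, True)]
cpu_choice = pick_item("cpu", items)
battery_choice = pick_item("battery", items)
holder = pick_holder({"n1": 2, "n2": 7, "n3": 4})
```

Both functions use only information a node could collect from its neighborhood, in line with the decentralized design the abstract describes.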
28	Programming Model and Protocols for Reconfigurable Distributed Systems Arad, Cosmin January 2013 (has links) Distributed systems are everywhere. From large datacenters to mobile devices, an ever richer assortment of applications and services relies on distributed systems, infrastructure, and protocols. Despite their ubiquity, testing and debugging distributed systems remains notoriously hard. Moreover, aside from inherent design challenges posed by partial failure, concurrency, or asynchrony, there remain significant challenges in the implementation of distributed systems. These programming challenges stem from the increasing complexity of the concurrent activities and reactive behaviors in a distributed system on the one hand, and the need to effectively leverage the parallelism offered by modern multi-core hardware, on the other hand. This thesis contributes Kompics, a programming model designed to alleviate some of these challenges. Kompics is a component model and programming framework for building distributed systems by composing message-passing concurrent components. Systems built with Kompics leverage multi-core machines out of the box, and they can be dynamically reconfigured to support hot software upgrades. A simulation framework enables deterministic execution replay for debugging, testing, and reproducible behavior evaluation for large-scale Kompics distributed systems. The same system code is used for both simulation and production deployment, greatly simplifying the system development, testing, and debugging cycle. We highlight the architectural patterns and abstractions facilitated by Kompics through a case study of a non-trivial distributed key-value storage system. CATS is a scalable, fault-tolerant, elastic, and self-managing key-value store which trades off service availability for guarantees of atomic data consistency and tolerance to network partitions. 
We present the composition architecture for the numerous protocols employed by the CATS system, as well as our methodology for testing the correctness of key CATS algorithms using the Kompics simulation framework. Results from a comprehensive performance evaluation attest that CATS achieves its claimed properties and delivers a level of performance competitive with similar systems which provide only weaker consistency guarantees. More importantly, this testifies that Kompics admits efficient system implementations. Its use as a teaching framework as well as its use for rapid prototyping, development, and evaluation of a myriad of scalable distributed systems, both within and outside our research group, confirm the practicality of Kompics. / Kompics / CATS / REST distributed systems programming model message-passing concurrency nested hierarchical composition reactive components software architecture dynamic reconfiguration multi-core discrete-event simulation peer-to-peer testing debugging distributed key-value stores data replication consistency linearizability network partition tolerance consistent hashing self-organization scalability elasticity fault tolerance consistent quorums
29	Programming Model and Protocols for Reconfigurable Distributed Systems Arad, Cosmin Ionel January 2013 (has links) Distributed systems are everywhere. From large datacenters to mobile devices, an ever richer assortment of applications and services relies on distributed systems, infrastructure, and protocols. Despite their ubiquity, testing and debugging distributed systems remains notoriously hard. Moreover, aside from inherent design challenges posed by partial failure, concurrency, or asynchrony, there remain significant challenges in the implementation of distributed systems. These programming challenges stem from the increasing complexity of the concurrent activities and reactive behaviors in a distributed system on the one hand, and the need to effectively leverage the parallelism offered by modern multi-core hardware, on the other hand. This thesis contributes Kompics, a programming model designed to alleviate some of these challenges. Kompics is a component model and programming framework for building distributed systems by composing message-passing concurrent components. Systems built with Kompics leverage multi-core machines out of the box, and they can be dynamically reconfigured to support hot software upgrades. A simulation framework enables deterministic execution replay for debugging, testing, and reproducible behavior evaluation for large-scale Kompics distributed systems. The same system code is used for both simulation and production deployment, greatly simplifying the system development, testing, and debugging cycle. We highlight the architectural patterns and abstractions facilitated by Kompics through a case study of a non-trivial distributed key-value storage system. CATS is a scalable, fault-tolerant, elastic, and self-managing key-value store which trades off service availability for guarantees of atomic data consistency and tolerance to network partitions. 
We present the composition architecture for the numerous protocols employed by the CATS system, as well as our methodology for testing the correctness of key CATS algorithms using the Kompics simulation framework. Results from a comprehensive performance evaluation attest that CATS achieves its claimed properties and delivers a level of performance competitive with similar systems which provide only weaker consistency guarantees. More importantly, this testifies that Kompics admits efficient system implementations. Its use as a teaching framework as well as its use for rapid prototyping, development, and evaluation of a myriad of scalable distributed systems, both within and outside our research group, confirm the practicality of Kompics. / QC 20130520 distributed systems programming model message-passing concurrency nested hierarchical composition reactive components software architecture dynamic reconfiguration multi-core discrete-event simulation peer-to-peer testing debugging distributed key-value stores data replication consistency linearizability network partition tolerance consistent hashing self-organization scalability elasticity fault tolerance consistent quorums
30	Distribuerade datalagringssystem för tjänsteleverantörer : Undersökning av olika användningsfall för distribuerade datalagringssystem / Distributed Data Storage Systems for Service Providers : Investigation of different use cases for distributed data storage systems Ahmed, Tanvir Saif, Markovic, Bratislav January 2016 (has links) Detta examensarbete handlar om undersökning av tre olika användningsfall inom datalagring; Cold Storage, High Performance Storage och Virtual Machine Storage. Rapporten har som syfte att ge en översikt över kommersiella distribuerade filsystem samt en djupare undersökning av distribuerade filsystem som bygger på öppen källkod och därmed hitta en optimal lösning för dessa användningsfall. I undersökningen ingick att analysera och jämföra tidigare arbeten där jämförelser mellan prestandamätningar, dataskydd och kostnader utfördes samt lyfta upp diverse funktionaliteter (snapshotting, multi-tenancy, datadeduplicering, datareplikering) som moderna distribuerade filsystem kännetecknas av. Både kommersiella och öppna distribuerade filsystem undersöktes. Även en kostnadsuppskattning för kommersiella och öppna distribuerade filsystem gjordes för att ta reda på lönsamheten för dessa två typer av distribuerat filsystem. Efter att jämförelse och analys av olika tidigare arbeten utfördes, visade sig att det öppna distribuerade filsystemet Ceph lämpade sig bra som en lösning utifrån kraven som sattes som mål för High Performance Storage och Virtual Machine Storage. Kostnadsuppskattningen visade att det var mer lönsamt att implementera ett öppet distribuerat filsystem. Denna undersökning kan användas som en vägledning vid val mellan olika distribuerade filsystem. / This thesis studies three different use cases within the field of data storage: Cold Storage, High Performance Storage and Virtual Machine Storage. 
The purpose of the survey is to give an overview of commercial distributed file systems and a deeper study of open source distributed file systems in order to find the optimal solution for these use cases. Within the study, previous works concerning performance, data protection and costs were analyzed and compared in order to identify the functionalities (snapshotting, multi-tenancy, data deduplication and data replication) that distinguish modern distributed file systems. Both commercial and open distributed file systems were examined. A cost estimation for commercial and open distributed file systems was made in order to determine the profitability of these two types of distributed file systems. After comparing and analyzing previous works, it was clear that the open source distributed file system Ceph was a suitable solution for the objectives set for High Performance Storage and Virtual Machine Storage. The cost estimation showed that it was more profitable to implement an open distributed file system. This study can be used as guidance when choosing between different distributed file systems. Cold Storage High Performance Storage Virtual Machine Storage use cases snapshotting multi-tenancy data deduplication data replication distributed file systems Cold Storage High Performance Storage Virtual Machine Storage användningsfall snapshotting multi-tenancy datadeduplicering datareplikering distribuerade filsystem Computer Systems Datorsystem Computer Engineering Datorteknik
