31 |
Chord - A Distributed Hash Table. Liao, Yimei, 24 July 2006
An introduction to the Chord algorithm.
|
32 |
Chord - A Distributed Hash Table. Liao, Yimei, 21 August 2007
The source has been converted to PDF format.
An introduction to the Chord algorithm.
|
33 |
Persistence and Node Failure Recovery in a Strongly Consistent Key-Value Datastore. Ehsan ul Haque, Muhammad, January 2012
Consistency preservation of replicated data is a critical aspect for distributed databases that are strongly consistent. Further, in the fail-recovery model each process also needs to deal with the management of stable storage and amnesia [1]. CATS is a key-value datastore that combines Distributed Hash Table (DHT)-like scalability and self-organization with atomic consistency of the replicated items. However, being an in-memory datastore with consistency and partition tolerance (CP), it suffers from permanent unavailability in the event of a majority failure.
The goals of this thesis were twofold: (i) to implement disk-persistent storage in CATS, which allows the records and the state of the nodes to be persisted on disk, and (ii) to design a node failure-recovery algorithm for CATS that enables the system to run under the assumptions of a fail-recovery model without violating consistency.
For disk-persistent storage, two existing key-value databases are used: LevelDB [2] and BerkeleyDB [3]. LevelDB is an implementation of log-structured merge trees [4], whereas BerkeleyDB is an implementation of log-structured B+ trees [5]. Both have been used as underlying local storage for nodes, and the throughput and latency of the system with each is discussed. A technique to improve performance by allowing concurrent operations on the nodes is also discussed. The node failure-recovery algorithm is designed to allow nodes to crash and then recover without violating consistency, and to reinstate availability once a majority of nodes recover. The recovery algorithm is based on persisting the state variables of the Paxos [6] acceptor and proposer, together with consistent group memberships.
For fault tolerance and recovery, processes also need to copy records from the replication group. This becomes problematic when the number of records and the amount of data is huge. For this problem, a technique for transferring key-value records in bulk is also described, and its effect on the latency and throughput of the system is discussed.
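To make the recovery mechanism concrete, here is a minimal sketch (assumed names and storage format, not code from CATS) of a Paxos acceptor that persists its state variables to stable storage before replying, so that a node restarting after a crash cannot contradict promises it made before failing:

```python
import json
import os

class PersistentAcceptor:
    """Paxos acceptor that writes its state variables to disk before
    answering, so a crash-recovered node never contradicts earlier
    promises (the fail-recovery model without amnesia)."""

    def __init__(self, path):
        self.path = path
        # Recover state from stable storage if present.
        if os.path.exists(path):
            with open(path) as f:
                self.state = json.load(f)
        else:
            self.state = {"promised": -1, "accepted_ballot": -1, "accepted_value": None}

    def _persist(self):
        # Write-then-rename for atomicity; fsync forces it to stable storage.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.state, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.path)

    def on_prepare(self, ballot):
        if ballot > self.state["promised"]:
            self.state["promised"] = ballot
            self._persist()  # persist BEFORE replying
            return ("promise", self.state["accepted_ballot"], self.state["accepted_value"])
        return ("nack", self.state["promised"])

    def on_accept(self, ballot, value):
        if ballot >= self.state["promised"]:
            self.state.update(promised=ballot, accepted_ballot=ballot, accepted_value=value)
            self._persist()  # persist BEFORE acknowledging
            return ("accepted", ballot)
        return ("nack", self.state["promised"])
```

In the system described above, the persisted state would live in the node's LevelDB or BerkeleyDB instance rather than a flat file, but the ordering constraint is the same: state reaches disk before the reply leaves the node.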
|
34 |
A Machine-Checked Proof of Correctness of Pastry / Une preuve certifiée par la machine de la correction du protocole Pastry. Azmy, Noran, 24 November 2016
Peer-to-peer (P2P) networks are an increasingly popular model for programming Internet applications because they favor decentralization, scalability, fault tolerance and self-organization. Unlike the traditional client-server model, a P2P network is a decentralized distributed system in which all nodes interact directly with each other and play the roles of both provider and user of services and resources. A distributed hash table (DHT) is implemented by a P2P network and offers the same services as a classic hash table, except that the different (key, value) pairs are stored at different nodes of the network. The main function of a DHT is looking up the value associated with a given key. Protocols that implement DHTs include Chord, Pastry, Kademlia and Tapestry. These protocols promise to guarantee certain correctness and performance properties; however, attempts to formally prove such properties invariably run into border cases in which certain properties are violated. Tianxiang Lu described correctness problems in published versions of Pastry. He designed a model, called LuPastry, for which he provided a partial proof, mechanized in the TLA+ Proof System, showing that key lookup messages are delivered to the correct node of the network in the case without node departures. In analyzing Lu's proof, I discovered that it contained many assumptions for which no proof had been provided, and I was able to find counterexamples to several of these assumptions. This thesis makes three contributions. First, I present LuPastry+, a revised TLA+ specification of LuPastry. Beyond the necessary bug fixes, LuPastry+ improves on LuPastry by introducing new operators and definitions, leading to a more modular specification and isolating the complexity of reasoning to circumscribed parts of the proof, thereby helping to further automate the proof. Second, I present a complete TLA+ proof of correct delivery for LuPastry+. Finally, I show that the final step of the node join process in LuPastry (and LuPastry+) is not necessary to guarantee the consistency of the protocol. Concretely, I exhibit a new specification with a simplified node join process, which I call Simplified LuPastry+, and prove that it guarantees correct delivery of key lookup messages. The proof of correctness for Simplified LuPastry+ is obtained by reusing the proof for LuPastry+, which represents a success story for proof reuse, especially considering the size of these proofs. Each of the two proofs requires more than 30,000 interactive steps; to my knowledge, they constitute the longest proofs written in the TLA+ language to date, and the only examples of applying mechanized theorem proving to the verification of DHT protocols. / A distributed hash table (DHT) is a peer-to-peer network that offers the function of a classic hash table, but where different key-value pairs are stored at different nodes on the network. Like a classic hash table, the main function provided by a DHT is key lookup, which retrieves the value stored at a given key. Examples of DHT protocols include Chord, Pastry, Kademlia and Tapestry.
Such DHT protocols promise certain correctness and performance guarantees, but formal verification typically discovers border cases that violate those guarantees. In his PhD thesis, Tianxiang Lu reported correctness problems in published versions of Pastry and developed a model called LuPastry, for which he provided a partial proof of correct delivery of lookup messages assuming no node failure, mechanized in the TLA+ Proof System. In analyzing Lu's proof, I discovered that it contained unproven assumptions, and found counterexamples to several of these assumptions. The contribution of this thesis is threefold. First, I present LuPastry+, a revised TLA+ specification of LuPastry. Aside from needed bug fixes, LuPastry+ contains new definitions that make the specification more modular and significantly improve proof automation. Second, I present a complete TLA+ proof of correct delivery for LuPastry+. Third, I prove that the final step of the node join process of LuPastry/LuPastry+ is not necessary to achieve consistency. In particular, I develop a new specification with a simpler node join process, which I denote by Simplified LuPastry+, and prove correct delivery of lookup messages for this new specification. The proof of correctness of Simplified LuPastry+ is written by reusing the proof for LuPastry+, which represents a success story in proof reuse, especially for proofs of this size. Each of the two proofs amounts to over 32,000 proof steps; to my knowledge, they are currently the largest proofs written in the TLA+ language, and, together with Lu's proof, the only examples of applying full theorem proving to the verification of DHT protocols.
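The delivery property at stake concerns the basic key-lookup function of a DHT. As a purely illustrative model (a Python toy, not the TLA+ specifications discussed here), the following sketch shows Chord/Pastry-style placement on an identifier ring, where correct delivery means every lookup reaches the node responsible for the key:

```python
import hashlib
from bisect import bisect_right

def ring_id(name, bits=32):
    # Hash a node name or key onto a 2^bits identifier ring.
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (1 << bits)

class RingDHT:
    """Toy single-process DHT: each key-value pair is stored at the
    key's successor, the first node clockwise from the key's position."""

    def __init__(self, node_names):
        self.nodes = sorted(ring_id(n) for n in node_names)
        self.store = {nid: {} for nid in self.nodes}

    def successor(self, key_hash):
        i = bisect_right(self.nodes, key_hash)
        return self.nodes[i % len(self.nodes)]  # wrap around the ring

    def put(self, key, value):
        self.store[self.successor(ring_id(key))][key] = value

    def lookup(self, key):
        return self.store[self.successor(ring_id(key))].get(key)

dht = RingDHT(["node-a", "node-b", "node-c"])
dht.put("alice", "42")
assert dht.lookup("alice") == "42"
```

The hard part, and the subject of the proofs above, is showing that this responsibility assignment stays consistent while nodes join the ring concurrently.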
|
35 |
The design and implementation of a robust, cost-conscious peer-to-peer lookup service. Harvesf, Cyrus Mehrabaun, 17 November 2008
Peer-to-peer (p2p) technology provides an excellent platform for the delivery of rich content and media that scales with the rapid growth of the Internet. This work presents a lookup service design and implementation that provides provable fault tolerance and operates in a cost-conscious manner over the Internet.
Using a distributed hash table (DHT) as a foundation, we propose a replica placement that improves object availability and reachability to implement a robust lookup service. We present a framework that describes tree-based routing DHTs and formally prove several properties for DHTs of this type. Specifically, we prove that our replica placement, which we call MaxDisjoint, creates a provable number of disjoint routes from any source node to a replica set. We evaluate this technique through simulation and demonstrate that it creates disjoint routes more effectively than existing replica placements. Furthermore, we show that disjoint routes have a marked impact on routing robustness, which we measure as the probability of lookup success.
To mitigate the costs incurred by multi-hop DHT routing, we develop an organization-based id assignment scheme that bounds the transit costs of prefix-matching routes. To further reduce costs, we use MaxDisjoint placement to create multiple routes of varying costs. This technique helps reduce cost in two ways: (1) replication may create local copies of an object that can be accessed at zero transit cost and (2) MaxDisjoint replication creates multiple, bounded cost, disjoint routes of which the minimal cost route can be used to resolve the lookup. We model the trade-off between the storage cost and routing cost benefit of replication to find the optimal degree to which an object should be replicated. We evaluate our approach using a lookup service implementation and show that it dramatically reduces cost over existing DHT implementations. Furthermore, we show that our technique can be used to manage objects of varying popularity in a manner that is more cost effective than caching.
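The storage-versus-routing trade-off can be illustrated with a deliberately simplified cost model (the model and numbers below are assumptions for illustration, not the one developed in this work): each replica adds a fixed storage cost, while each lookup pays only the cheapest of the disjoint routes that replication makes available.

```python
def optimal_replication_degree(storage_cost, route_costs, lookups_per_period):
    """Pick the replication degree k minimizing total cost under a toy model:
    storage grows linearly in k, while each lookup pays the cheapest of the
    k disjoint routes available once k replicas are placed."""
    best_k, best_total = 1, float("inf")
    for k in range(1, len(route_costs) + 1):
        lookup_cost = min(route_costs[:k])  # minimal-cost route among k disjoint routes
        total = k * storage_cost + lookups_per_period * lookup_cost
        if total < best_total:
            best_k, best_total = k, total
    return best_k, best_total

# A zero-cost local replica becomes worthwhile once an object is hot enough.
print(optimal_replication_degree(storage_cost=10, route_costs=[4, 2, 0, 7],
                                 lookups_per_period=100))  # -> (3, 30)
```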
By improving its robustness and cost effectiveness, we aim to increase the pervasiveness of p2p in practice and unlock the potential of this powerful technology.
|
36 |
[en] LUAPS - LUA PUBLISH-SUBSCRIBE / [pt] LUAPS - LUA PUBLISH-SUBSCRIBE. MARIO MENDES DE O ZIMMERMANN, 24 July 2006
[pt] Publish-subscribe systems are defined by their basic communication model. However, most existing publish-subscribe systems incorporate other mechanisms in their implementation. This work seeks a better understanding of publish-subscribe systems by defining an architecture in which different layers group related decisions and constructs. Based on this architecture, we describe a system developed in Lua that uses a distributed hash table as its foundation. The system differs from monolithic publish-subscribe systems and focuses on generality, flexibility and extensibility. / [en] Publish-subscribe systems are defined by their communication model. However, most existing publish-subscribe systems incorporate other mechanisms in their implementation. This work seeks a better understanding of publish-subscribe systems, defining an architecture where different layers group related decisions and constructions. Based on this architecture, we describe a system developed in Lua that uses a distributed hash table as its base. The system differs in its architecture from monolithic publish-subscribe systems and focuses on generality, flexibility and extensibility.
|
37 |
Structured peer-to-peer overlays for NATed churn intensive networks. Chowdhury, Farida, January 2015
The widespread coverage and ubiquitous presence of mobile networks have propelled the usage and adoption of mobile phones to an unprecedented level around the globe. The computing capabilities of these mobile phones have improved considerably, supporting a vast range of third-party applications. Simultaneously, Peer-to-Peer (P2P) overlay networks have experienced tremendous growth in usage as well as popularity in recent years, particularly in fixed wired networks. In particular, Distributed Hash Table (DHT) based structured P2P overlay networks offer major advantages to users of mobile devices and networks, such as a scalable, fault-tolerant and self-managing infrastructure that does not exhibit single points of failure. Integrating P2P overlays with mobile networks seems a logical progression, considering the popularity of both technologies. However, it imposes several challenges that need to be handled, such as the limited hardware capabilities of mobile phones, and churn-intensive mobile networks (churn being the frequent joining and leaving of nodes within a network) that offer limited yet expensive bandwidth.
This thesis investigates the feasibility of extending P2P to mobile networks so that users can take advantage of both technologies. It uses OverSim, a P2P simulator, to experiment with the performance of various P2P overlays, considering high churn and bandwidth consumption, the two most crucial constraints of mobile networks. The experimental results show that Kademlia and EpiChord are the two most appropriate P2P overlays for implementation in mobile networks.
Furthermore, Network Address Translation (NAT) is a major barrier to the adoption of P2P overlays in mobile networks, and integrating NAT traversal approaches with P2P overlays is a crucial step for P2P overlays to operate successfully on mobile networks. This thesis presents a general NAT traversal approach for ring-based overlays without the use of a single dedicated server, which is then implemented in OverSim. Several experiments have been performed under NATs to determine the suitability of the chosen P2P overlays in NATed environments. The results show that the performance of these overlays is comparable in terms of successful lookups in both NATed and non-NATed environments, with Kademlia and EpiChord exhibiting the best performance.
The presence of NATs and the level of churn in a network influence the routing techniques used in P2P overlays. Recursive routing is more resilient to the IP connectivity restrictions posed by NATs but not very robust in high-churn environments, whereas iterative routing is more suitable for high-churn networks but difficult to use in NATed environments. Kademlia supports both routing schemes, whereas EpiChord only supports iterative routing, which undermines the usefulness of EpiChord in NATed environments. In order to harness the advantages of both routing schemes, this thesis presents an adaptive routing scheme, called Churn Aware Routing Protocol (ChARP), combining recursive and iterative lookups, in which nodes can switch between recursive and iterative routing depending on their lifetimes. The proposed approach has been implemented in OverSim and several experiments have been carried out. The experimental results indicate improved performance, which in turn validates the applicability and suitability of ChARP in NATed environments.
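The abstract describes ChARP's switching rule only at a high level; the sketch below is one assumed interpretation (the class names, the threshold value and the overlay interface are all hypothetical), in which a node's own lifetime decides which routing mode it uses:

```python
import time

LIFETIME_THRESHOLD = 300  # seconds; illustrative value only

class CharpNode:
    """Churn-aware routing: a recently joined node uses iterative lookups,
    which tolerate churn better; a long-lived node switches to recursive
    lookups, which cope better with NAT connectivity restrictions."""

    def __init__(self, overlay):
        self.overlay = overlay
        self.joined_at = time.monotonic()

    def lifetime(self):
        return time.monotonic() - self.joined_at

    def lookup(self, key):
        if self.lifetime() < LIFETIME_THRESHOLD:
            return self.overlay.iterative_lookup(key)  # robust under churn
        return self.overlay.recursive_lookup(key)      # NAT-friendly

class StubOverlay:
    """Placeholder overlay; a real one would perform multi-hop routing."""
    def iterative_lookup(self, key):
        return ("iterative", key)
    def recursive_lookup(self, key):
        return ("recursive", key)

node = CharpNode(StubOverlay())
print(node.lookup("some-key"))  # ("iterative", ...) while the node is young
```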
|
38 |
FreeCore : un système d'indexation de résumés de document sur une Table de Hachage Distribuée (DHT) / FreeCore : an index system of summaries of documents on a Distributed Hash Table (DHT). Ngom, Bassirou, 13 July 2018
This thesis studies the problem of indexing and searching in Distributed Hash Tables (DHTs). It proposes a system for distributed storage of document summaries based on their content. Concretely, the thesis uses Bloom filters (BFs) to represent document summaries and proposes an efficient method for inserting and retrieving documents represented by BFs in an index distributed over a DHT. Content-based storage has a double advantage: it groups similar documents together so that they can be found more quickly, and at the same time it allows documents to be retrieved through keyword searches using a BF. However, resolving a keyword query represented by a Bloom filter is a complex operation; it requires a mechanism for locating the descendant Bloom filters that represent documents stored in the DHT. The thesis therefore proposes, in a second step, two indexes of Bloom filters distributed over DHTs. The first proposed index system combines the principles of content-based indexing and inverted lists, and addresses the problem of the large amount of data stored by content-based indexes. Indeed, by using long Bloom filters, our solution makes it possible to store documents on a larger number of servers and to index them using less space. Next, the thesis proposes a second index system that efficiently supports the processing of superset queries (keyword queries) using a prefix tree. This solution exploits the distribution of the data and proposes a configurable partitioning function that makes it possible to index documents with a balanced binary tree, so that documents are distributed efficiently over the indexing servers. In addition, as a third solution, the thesis proposes an efficient method for locating documents containing a given set of keywords. Compared to solutions in the same category, this last solution makes it possible to perform superset searches at lower cost and constitutes a solid foundation for superset search on index systems built on top of DHTs. Finally, the thesis proposes a prototype of a peer-to-peer system for content indexing and keyword search. This prototype, ready to be deployed in a real environment, is evaluated in the PeerSim simulation environment, which made it possible to measure the theoretical performance of the algorithms developed throughout the thesis. / This thesis examines the problem of indexing and searching in Distributed Hash Tables (DHTs). It provides a distributed system for storing document summaries based on their content. Concretely, the thesis uses Bloom filters (BFs) to represent document summaries and proposes an efficient method for inserting and retrieving documents represented by BFs in an index distributed on a DHT. Content-based storage has a dual advantage: it allows similar documents to be grouped together so that they can be found and retrieved more quickly, while at the same time supporting keyword searches through Bloom filters. However, processing a keyword query represented by a Bloom filter is a difficult operation and requires a mechanism to locate the Bloom filters that represent documents stored in the DHT.
The thesis therefore proposes, in a second step, two Bloom filter index schemes distributed on DHTs. The first proposed index system combines the principles of content-based indexing and inverted lists, and addresses the issue of the large amount of data stored by content-based indexes. Indeed, by using long Bloom filters, this solution makes it possible to store documents on a large number of servers and to index them using less space. Next, the thesis proposes a second index system that efficiently supports superset query processing (keyword queries) using a prefix tree. This solution exploits the distribution of the data and proposes a configurable distribution function that allows documents to be indexed with a balanced binary tree. In this way, documents are distributed efficiently over the indexing servers. In addition, as a third solution, the thesis proposes an efficient method for locating documents containing a given set of keywords. Compared to solutions of the same category, this last solution makes it possible to perform superset searches at a lower cost and can be considered a solid foundation for superset query processing on over-DHT index systems. Finally, the thesis proposes a prototype of a peer-to-peer system for content indexing and keyword search. This prototype, ready to be deployed in a real environment, is evaluated with the PeerSim simulator, which made it possible to measure the theoretical performance of the algorithms developed throughout the thesis.
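The core representation can be made concrete: a document's keyword set becomes a fixed-length Bloom filter, and a keyword query (itself a filter) matches any document whose filter has all of the query's bits set, i.e. a bitwise superset test. The sketch below is illustrative only, with assumed parameters rather than FreeCore's actual configuration:

```python
import hashlib

class BloomFilter:
    """Fixed-length Bloom filter over a set of keywords."""

    def __init__(self, m=256, k=4):
        self.m, self.k, self.bits = m, k, 0

    def _positions(self, word):
        digest = hashlib.sha256(word.encode()).digest()
        # Derive k bit positions from independent 4-byte slices of the digest.
        return [int.from_bytes(digest[4 * i:4 * i + 4], "big") % self.m
                for i in range(self.k)]

    def add(self, word):
        for p in self._positions(word):
            self.bits |= 1 << p

    def superset_of(self, query):
        # Every bit set in the query is also set here, so this document
        # may contain all the query's keywords (false positives possible,
        # false negatives not).
        return query.bits & ~self.bits == 0

doc = BloomFilter()
for w in ["distributed", "hash", "table", "index"]:
    doc.add(w)

query = BloomFilter()
for w in ["hash", "index"]:
    query.add(w)

assert doc.superset_of(query)  # doc is a candidate match for the query
```

Because the test is a pure bit operation, an index server can screen many stored filters cheaply; the prefix tree described above narrows down which servers need to run it at all.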
|
39 |
Robust, fault-tolerant majority based key-value data store supporting multiple data consistency. Khan, Tareq Jamal, January 2011
Web 2.0 has significantly transformed the way modern society works today. In today's Web, information not only flows top-down from web sites to readers, but also bottom-up, contributed by masses of users. Hugely popular Web 2.0 applications like wikis, social applications (e.g. Facebook, MySpace), media sharing applications (e.g. YouTube, Flickr), blogging and numerous others generate a great deal of user-generated content and make heavy use of the underlying storage. The data storage system is the heart of these applications, as all user activities are translated to read and write requests and directed to the database for further action. The focus is therefore on the storage that serves data to support the applications; its reliable and efficient design is instrumental for applications to perform in line with expectations.
Large-scale storage systems are used by popular social networking services like Facebook and MySpace, where millions of users' data are stored and fully accessible to these companies. From the users' point of view, however, there has been justified concern about data ownership and lack of control over personal data. For example, on more than one occasion Facebook has exercised its control over users' data without respecting users' rights to ownership of their own content, and has manipulated data for its own business interest without users' knowledge or consent.
This thesis proposes, designs and implements a large-scale, robust and fault-tolerant key-value data storage prototype that is peer-to-peer based and backs away from the client-server paradigm, with a view to relieving companies of data storage and management responsibilities and letting users control their own personal data. Several read and write APIs (similar to Yahoo!'s PNUTS but different in terms of underlying design and target environment) with various data consistency guarantees are provided, from which a wide range of web applications can choose according to their data consistency, performance and availability requirements. An analytical comparison is also made against the PNUTS system, which targets a more stable environment. For evaluation, simulation has been carried out to test the system's availability, scalability and fault tolerance in a dynamic environment. The results are analyzed and the conclusion is drawn that the system is scalable, available and shows acceptable performance.
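A toy sketch of how read APIs with different consistency guarantees might look over majority-based replication (the names and structure are assumptions for illustration, not the thesis's actual interfaces): a majority read always intersects the latest majority write, while an any-replica read trades freshness for latency.

```python
import random

class QuorumKV:
    """Toy replicated key-value store with two read APIs: a strongly
    consistent majority read and a faster, possibly stale any-replica read.
    Values are stored as (version, value) pairs."""

    def __init__(self, n_replicas=5):
        self.replicas = [{} for _ in range(n_replicas)]
        self.majority = n_replicas // 2 + 1

    def write_majority(self, key, value, version):
        # The write completes once a majority of replicas hold the new version.
        for r in random.sample(self.replicas, self.majority):
            r[key] = (version, value)

    def read_any(self, key):
        # Fast but weak: one random replica, which may not have seen the write.
        return random.choice(self.replicas).get(key)

    def read_majority(self, key):
        # Any two majorities intersect, so at least one answer carries the
        # most recently written version; return the highest-versioned one.
        answers = [r.get(key) for r in random.sample(self.replicas, self.majority)]
        return max((a for a in answers if a is not None), default=None)

kv = QuorumKV()
kv.write_majority("profile", "v1", version=1)
assert kv.read_majority("profile") == (1, "v1")
```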
|
40 |
Lärandeanalys med automaträttning : En undersökning av studenters svårigheter att implementera hashtabeller i en grundkurs i datalogi / Learning Analytics on Automatic Evaluation : An investigation of students' difficulties with implementing hash tables in an undergraduate computer science course. Eklund, Linus, January 2023
The hash table, a data structure fundamental to computer science, is demanding to master. A study was conducted in a course on algorithms and data structures with 200 participants. The learning objective "implement a hash table and hash function" was broken down into basic skills, which were tested in an automatically graded programming assignment with targeted feedback to the students. 86 students made 334 attempts to solve the assignment. The study showed that the tests included in the automatically graded assignment correspond to the errors students make. The students' errors were categorized according to the identified basic skills. The categorization can be used to identify students' weak areas and modify the teaching accordingly. The attempts also show that when the assignment requires the simultaneous application of two concepts, this often leads to errors in the implementation of the algorithm or to inefficient solutions. / The hash table data structure, which is fundamental to computer science, is demanding to learn. A study was conducted in a course on algorithms and data structures with 200 participants. The learning outcome of implementing a hash table was broken down into basic skills that were tested in an automatically graded programming task with targeted feedback to the students. 86 students made 334 attempts to solve the task. The study showed that the tests included in the automatically graded task correspond to the errors made by the students. The students' errors were categorized according to the identified basic skills. The categorization can be used to identify students' weak areas and to modify the teaching accordingly. The attempts also show that when the task requires the simultaneous application of two concepts, this often leads to errors in the implementation of the algorithm or to inefficient solutions.
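The learning outcome under study, implementing a hash table and a hash function, corresponds to code along the lines of the following sketch (an illustrative separate-chaining implementation, not the course's actual assignment):

```python
class HashTable:
    """Separate-chaining hash table exercising the two tested skills:
    writing a hash function and handling collisions."""

    def __init__(self, capacity=8):
        self.capacity = capacity
        self.size = 0
        self.buckets = [[] for _ in range(capacity)]

    def _hash(self, key):
        # Polynomial string hash reduced modulo the current capacity.
        h = 0
        for ch in str(key):
            h = (h * 31 + ord(ch)) % self.capacity
        return h

    def put(self, key, value):
        bucket = self.buckets[self._hash(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:            # key already present: update in place
                bucket[i] = (key, value)
                return
        bucket.append((key, value))
        self.size += 1
        if self.size > 0.75 * self.capacity:
            self._resize()          # keep the load factor bounded

    def get(self, key):
        for k, v in self.buckets[self._hash(key)]:
            if k == key:
                return v
        raise KeyError(key)

    def _resize(self):
        pairs = [p for bucket in self.buckets for p in bucket]
        self.capacity *= 2
        self.size = 0
        self.buckets = [[] for _ in range(self.capacity)]
        for k, v in pairs:
            self.put(k, v)

t = HashTable()
t.put("alice", 1)
t.put("bob", 2)
assert t.get("alice") == 1 and t.get("bob") == 2
```

The "two concepts at once" difficulty the study observes shows up exactly here: `put` must combine hashing with collision handling, and `_resize` must combine rehashing with the changed capacity.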
|