Global ETD Search

1	Comparative analysis of PropertyFirst vs. EntityFirst modeling approaches in graph databases 2015 March 1900 While relational databases still hold the primary position in database technology, and have held it longer than any other Computer Science technology, they now face, for the first time, a valid and worthy opponent in the NoSQL database movement. NoSQL databases, though unfamiliar to many people, including a significant number of computer scientists, have spread rapidly in many shapes and forms, and have done so in quite a chaotic fashion. As with their appearance and spread, design and modeling for them have been undertaken in an unstructured manner. They are currently subcategorized into four main groups: key-value stores, column-family stores, document stores, and graph databases. In this thesis, different modeling approaches for graph databases, applied to the same domain, are analyzed and compared, especially from a design perspective. The database selected as the implementation technology is Neo4j by Neo Technology, a directed property graph database, which means that each relationship between its data entities must have a start and an end (source and destination) node. This research provides an overview of two competing modeling approaches and evaluates them in the context of a real-world example. The work shows that both modeling approaches are valid, that a data model for the same domain data can be fully developed with either approach, and that both can later support application access in a similar fashion. One of the models provides faster access to data, but at the cost of higher maintenance and increased complexity. NoSQL, Graph Databases, modeling
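The directed property graph model mentioned in entry 1 can be illustrated with a minimal sketch in plain Python. This is not Neo4j's API and not either of the thesis's modeling approaches; the names `Node`, `Relationship`, and `PropertyGraph` and the sample data are hypothetical. It only shows the constraint stated in the abstract: every relationship must have an existing start node and end node.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    labels: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    rel_type: str
    start: int  # id of the source node
    end: int    # id of the destination node
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Toy in-memory directed property graph (illustrative only)."""

    def __init__(self):
        self.nodes = {}
        self.relationships = []

    def add_node(self, node_id, labels=(), **properties):
        node = Node(node_id, set(labels), dict(properties))
        self.nodes[node_id] = node
        return node

    def add_relationship(self, rel_type, start, end, **properties):
        # A directed relationship cannot dangle: both endpoints must exist.
        if start not in self.nodes or end not in self.nodes:
            raise ValueError("relationship needs an existing start and end node")
        rel = Relationship(rel_type, start, end, dict(properties))
        self.relationships.append(rel)
        return rel

    def outgoing(self, node_id):
        # Relationships whose source is the given node.
        return [r for r in self.relationships if r.start == node_id]


# Hypothetical sample data.
graph = PropertyGraph()
graph.add_node(1, labels=["Person"], name="Alice")
graph.add_node(2, labels=["Company"], name="Acme")
graph.add_relationship("WORKS_AT", 1, 2, since=2015)
```

Because relationships are directed, `graph.outgoing(1)` returns the `WORKS_AT` relationship while `graph.outgoing(2)` returns an empty list.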
2	Distributed Graph Storage And Querying System Balaji, Janani 12 August 2016 Graph databases offer an efficient way to store and access inter-connected data. However, to query large graphs that no longer fit in memory, it becomes necessary to make multiple trips to the storage device to filter and gather data based on the query. I/O accesses are expensive operations: they greatly slow down query response time and prevent us from fully exploiting the graph-specific benefits that graph databases offer. The storage models of most existing graph database systems view graphs as indivisible structures and hence do not allow a hierarchical layering of the graph. This adversely affects query performance for large graphs, as there is no way to filter the graph at a higher level without reading the entire information from disk. Distributing the storage and processing is one way to extract better performance. But current distributed solutions to this problem are not entirely effective, again due to the indivisible representation of graphs adopted in the storage format. This causes unnecessary latency due to increased inter-processor communication. In this dissertation, we propose an optimized distributed graph storage system for scalable and faster querying of big graph data. We start with our unique physical storage model, in which the graph is decomposed into three different levels of abstraction, each with a different storage hierarchy. We use a hybrid storage model to store the most critical component and restrict I/O trips to when they are absolutely necessary. This lets us actively make use of multi-level filters while querying, without the need for comprehensive indexes. Our results show that our system outperforms established graph databases for several classes of queries.
We show that this separation also eases the difficulties in distributing graph data, and go on to propose a more efficient distributed model for querying general-purpose graph data using the Spark framework. Graph Databases Distributed Graph Databases Distributed Graph Query Processing Spark
3	DistNeo4j: Scaling Graph Databases through Dynamic Distributed Partitioning Nicoara, Daniel 14 October 2014 Social networks are large graphs which require multiple servers to store and manage them. Providing performant, scalable systems that store these graphs by partitioning them into subgraphs is an important issue. In such systems each partition is hosted by a server to satisfy multiple objectives. These objectives include balancing server loads, reducing remote traversals (number of edges cut), and adapting the partitioning to changes in the structure of the graph in the face of changing workloads. To address these issues, a dynamic repartitioning algorithm is required to modify an existing partitioning to maintain good quality partitions. Such a repartitioner should not impose a significant overhead on the system. This thesis introduces a greedy repartitioner, which dynamically modifies a partitioning using a small amount of resources. In contrast to existing repartitioning algorithms, the greedy repartitioner is performant (in terms of time and memory), making it suitable for implementation and use in a real system. The greedy repartitioner is integrated into DistNeo4j, which is designed as an extension of the open source Neo4j graph database system, to support workloads over partitioned graph data distributed over multiple servers. Using real-world data sets, this thesis shows that DistNeo4j leverages the greedy repartitioner to maintain high quality partitions and provides a 2 to 3 times performance improvement over the de facto hash-based partitioning. Graph databases Distributed systems Re-partitioning
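Two of the partitioning objectives named in entry 3, balanced server loads and a small number of cut edges, can be made concrete with a short Python sketch. This is not the thesis's greedy repartitioner or DistNeo4j code; the function names, the modulo-based stand-in for hash partitioning, and the toy graph are assumptions chosen for illustration.

```python
from collections import Counter


def hash_partition(vertices, num_servers):
    # Simplified stand-in for hash-based partitioning: place each
    # integer vertex id by its value modulo the server count.
    return {v: v % num_servers for v in vertices}


def edge_cut(edges, assignment):
    # Number of edges whose endpoints live on different servers;
    # each such edge implies a remote traversal.
    return sum(1 for u, v in edges if assignment[u] != assignment[v])


def server_loads(assignment, num_servers):
    # Number of vertices hosted by each server.
    counts = Counter(assignment.values())
    return [counts.get(s, 0) for s in range(num_servers)]


# Toy graph (hypothetical data): two triangles joined by one edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
vertices = range(6)

hashed = hash_partition(vertices, 2)
# A structure-aware placement that keeps each triangle on one server.
clustered = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

print(edge_cut(edges, hashed), server_loads(hashed, 2))        # 5 [3, 3]
print(edge_cut(edges, clustered), server_loads(clustered, 2))  # 1 [3, 3]
```

Both placements balance the load equally, but the placement that follows the graph's structure cuts one edge instead of five. A repartitioner tries to close that gap as the graph changes.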
4	Generování rodokmenů z matričních záznamů / Family Trees Making from Parish Records Tušimová, Lucia January 2020 This work discusses the field of genealogy, the different types of records, and the data they contain. The thesis describes data comparison and record linkage. It further discusses the design and implementation of the resulting system. The developed system connects people from parish records into larger pedigrees, which are then stored in a graph database. The success of the record linkage was tested on the provided data sets.
5	Computing Label-Constraint Reachability in Graph Databases HONG, HUI 16 April 2012 No description available. Computer Science reachability graph databases algorithm
6	Analysis and Experimental Comparison of Graph Databases Kolomičenko, Vojtěch January 2013 In recent years a new type of NoSQL database, the graph database (GDB), has gained significant popularity due to the increasing need to process and store data in the form of a graph. The objective of this thesis is to research the possibilities and limitations of GDBs and to conduct an experimental comparison of selected GDB implementations. For this purpose, the requirements of a universal GDB benchmark have been formulated and an extensible benchmarking tool, named BlueBench, has been developed.
7	Semantic Assistance for Data Utilization and Curation Becker, Brian J 06 August 2013 We propose that most data stores for large organizations are ill-designed for the future, due to limited searchability of the databases. The Semantic Web has been an emerging technology since it was first proposed by Berners-Lee. New vocabularies have emerged, such as the FOAF, Dublin Core, and PROV-O ontologies. Combined, these vocabularies can relate people, places, things, and events. Technologies developed for the Semantic Web, namely the standardized vocabularies for expressing metadata, will make data easier to utilize. We gathered use cases for various data sources, from human resources to big enterprise. Most of our use cases reflect real-world data. We developed a software package for transforming data into these semantic vocabularies, and developed a method of querying via graphical constructs. Development and testing showed the approach to be useful. We conclude that data can be preserved or revived through the use of the metadata techniques of the Semantic Web. Ontology Graph Databases RDF Provenance Hyperspectral Inference Databases and Information Systems
8	Historisation de données dans les bases de données NoSQL orientées graphes / Historical management in NoSQL Graph Databases Castelltort, Arnaud 30 September 2014 Cette thèse porte sur l'historisation des données dans les bases de données graphes. La problématique des données en graphes existe depuis longtemps mais leur exploitation par des moteurs de système de gestion de bases de données, principalement dans les moteurs NoSQL, est récente. Cette apparition est notamment liée à l'émergence des thématiques Big Data dont les propriétés intrinsèques, souvent décrites à l'aide des propriétés 3V (variété, volume, vélocité), ont révélé les limites des bases de données relationnelles classiques. L'historisation quant à elle, est un enjeu majeur des SI qui a été longtemps abordé seulement pour des raisons techniques de sauvegarde, de maintenance ou plus récemment pour des raisons décisionnelles (suites applicatives de Business Intelligence). Cependant, cet aspect s'avère maintenant prendre une place prédominante dans les applications de gestion. Dans ce contexte, les bases de données graphes qui sont de plus en plus utilisées n'ont que très peu bénéficié des apports récents de l'historisation. La première contribution consiste à étudier le nouveau poids des données historisées dans les SI de gestion. Cette analyse repose sur l'hypothèse selon laquelle les applications de gestion intègrent de plus en plus en leur sein les enjeux d'historisation. Nous discutons ce positionnement au regard de l'analyse de l'évolution des SI par rapport à cette problématique. La deuxième contribution vise, au-delà de l'étude de l'évolution des systèmes d'information, à proposer un modèle innovant de gestion de l'historisation dans les bases de données NoSQL en graphes.
Cette proposition consiste d'une part en l'élaboration d'un système unique et générique de représentation de l'historique au sein des BD NoSQL en graphes et d'autre part à proposer des modes d'interrogation (requêtes). Nous montrons qu'il est possible d'utiliser ce système aussi bien pour des requêtes simples (c'est-à-dire correspondant à ce que l'on attend en première intention d'un système d'historisation : récupérer les précédentes versions d'une donnée) mais aussi de requêtes plus complexes qui permettent de tirer parti aussi bien de la notion d'historisation que des possibilités offertes par les bases de données graphes (par exemple, la reconnaissance de motifs dans le temps). / This thesis deals with data historization in the context of graphs. Graph data have been dealt with for many years but their exploitation in information systems, especially in NoSQL engines, is recent. The emerging Big Data and 3V contexts (Variety, Volume, Velocity) have revealed the limits of classical relational databases. Historization, for its part, has long been considered as linked only with technical and backup issues, and more recently with decisional reasons (Business Intelligence). However, historization is now taking on more and more importance in management applications. In this framework, graph databases, which are often used, have received little attention regarding historization. Our first contribution consists in studying the impact of historized data in management information systems. This analysis relies on the hypothesis that historization is taking on more and more importance.
Our second contribution aims at proposing an original model for managing historization in NoSQL graph databases. This proposition consists on the one hand in elaborating a unique and generic system for representing the history and on the other hand in proposing query features. We show that the system can support both simple and complex queries. Our contributions have been implemented and tested over synthetic and real databases. Bases de données graphes Historisation Gestion Graph Databases Historization Management
9	Implementing the GraphQL Interface on top of a Graph Database Mattsson, Linn January 2020 Since becoming an open source project in 2015, GraphQL has gained popularity as it is used as a query language from front-end to back-end, ensuring that no over-fetching or under-fetching is performed. While the query language has been openly available for a few years, there has been little academic research in this area. The aim of this thesis is to create an approach for using GraphQL on top of a graph database, as well as to evaluate the optimisation techniques available for this approach. This was done by developing logical plans and query execution plans, and the suitable optimisation techniques were found to be parallel execution and batching of database calls. The implementation was done in Java using the graph computing framework Apache TinkerPop, which is compatible with a number of graph databases. However, this implementation focuses on the graph database management system Neo4j. To evaluate the implementation, query templates and data from the Linköping GraphQL Benchmark were used. The logical plans were created by converting a GraphQL query into a tree of logical operators. The query execution plans were based on four different primitives from the Apache TinkerPop framework, and the physical operators were each influenced by one or more logical operators. The performance tests of the implementation showed that the query execution times were largely dependent on the query template as well as the number of database nodes visited. Regarding the relationship between execution times and the number of threads used in the parallel execution, it was concluded that lower execution times (<100 ms) improved when 4-6 threads were used, while higher execution times improved when 12-24 threads were used. For the very fast query executions (<5 ms), threading caused more overhead than the time saved by parallel execution, and in these cases it was better not to use any threading.
GraphQL Graph Databases Performance Logical Plans Computer Sciences Datavetenskap (datalogi)
10	Aplicação de conceitos de bancos de dados de grafos e relacional na criação de proposta e análise comparativa de abordagens para armazenamento de processos / A proposal for storage of processes between different databases Viégas, Rafael Pedroni January 2018 Em busca da documentação e otimização de seus processos, a área de Business Process Management (BPM) vem cada vez mais atraindo o interesse do meio empresarial, por ser um importante método no auxílio ao ganho de resultados, como redução de custos e aumento de produtividade. Modelar processos, entretanto, não basta. É preciso que se atente para métodos eficientes de armazená-los, permitindo que as informações sejam manipuladas e utilizadas de maneira prática e inteligente. A presente dissertação propõe duas abordagens para armazenamento de modelos de processo, uma em bancos de dados relacionais e outra em bancos de dados orientados a grafos, comparando-os através de aspectos como desempenho na execução das operações e proximidade da abordagem de cada um deles com os modelos de processos. Enquanto os bancos de dados relacionais são mais populares, sendo utilizados na maior parte das aplicações atuais, os bancos de dados orientados a grafos possuem propriedades e representação gráfica semelhantes aos modelos de processos. Foram realizados testes que visam analisar o desempenho de ambas as abordagens, além da facilidade dos usuários em interagir com os modelos propostos. Os resultados deste estudo podem ser utilizados para a criação de repositórios que compartilhem processos de maneira eficiente, bem como incentivar o estudo de novas maneiras para o armazenamento de processos. / The Business Process Management (BPM) area has increasingly attracted the interest of the business community, as organizations seek to document and optimize their processes. BPM can be an important method in helping to achieve results such as reduced costs and increased productivity.
However, modeling processes is not enough. It is necessary to pay attention to efficient storage methods, allowing information to be handled and used in a practical and intelligent way. The present work compares the use of relational databases and graph databases, considering aspects such as performance in the execution of operations and how closely each approach matches the process models. While relational databases are more popular, being used in most current applications, graph databases have properties and graphical representations similar to process models. The results of this study can be used to create repositories that share processes efficiently, and to encourage the study of new ways of storing processes. Banco : Dados Grafos Business process management storage Relational databases Graph databases
