Global ETD Search

31	Semantically-enabled stream processing and complex event processing over RDF graph streams / Traitement de flux sémantiquement activé et traitement d'évènements complexes sur des flux de graphe RDF Gillani, Syed 04 November 2016 French abstract not provided by the author. / There is a paradigm shift in the nature of today's data and in the means of processing it: data used to be mostly static, stored in large databases to be queried. Today, with the advent of new applications and new means of collecting data, most applications on the Web and in enterprises produce data continuously in the form of streams. The users of these applications therefore expect to process large volumes of data and to receive fresh, low-latency results. This has led to the introduction of Data Stream Management Systems (DSMSs) and of the Complex Event Processing (CEP) paradigm, each with distinct aims: DSMSs are mostly employed to process traditional query operators (mostly stateless), while CEP systems focus on temporal pattern matching (stateful operators) to detect changes in the data that can be thought of as events. In the past decade or so, a number of scalable and performance-intensive DSMSs and CEP systems have been proposed. Most of them, however, are based on the relational data model, which raises the question of support for heterogeneous data sources, i.e., the variety of the data. Work on RDF stream processing (RSP) systems partly addresses the challenge of variety by promoting the RDF data model. Nonetheless, existing approaches overlook challenges such as volume and velocity. These challenges require customised optimisations that treat RDF as a first-class citizen and scale the process of continuous graph pattern matching. To gain insights into these problems, this thesis focuses on developing scalable RDF graph stream processing and semantically-enabled CEP systems (i.e., Semantic Complex Event Processing, SCEP).
In addition to our optimised algorithmic and data structure methodologies, we also contribute to the design of a new query language for SCEP. Our contributions in these two fields are as follows:
• RDF Graph Stream Processing. We first propose an RDF graph stream model, in which each data item/event within a stream comprises an RDF graph (a set of RDF triples). Second, we implement customised indexing techniques and data structures to process RDF graph streams continuously and incrementally.
• Semantic Complex Event Processing. We extend the idea of RDF graph stream processing to enable SCEP over such RDF graph streams, i.e., temporal pattern matching. Our first contribution in this context is a new query language that encompasses the RDF graph stream model and employs a set of expressive temporal operators such as sequencing, Kleene-+, negation, optional, conjunction, disjunction and event selection strategies. Based on this, we implement a scalable system that employs a nondeterministic finite automaton (NFA) model to evaluate these operators in an optimised manner.
We leverage techniques from diverse fields, such as relational query optimisations, incremental query processing, sensor and social networks, in order to solve real-world problems. We have applied our proposed techniques to a wide range of real-world and synthetic datasets to extract knowledge from RDF-structured data in motion. Our experimental evaluations confirm our theoretical insights and demonstrate the viability of our proposed methods. Stream processing; Complex event processing; RDF graphs; Query optimisations; Query design; Semantic web; Top-k queries; Graph databases
32	Considering User Intention in Differential Graph Queries Vasilyeva, Elena, Thiele, Maik, Bornhövd, Christof, Lehner, Wolfgang 30 November 2020 Empty answers are a major problem in processing pattern matching queries in graph databases. In particular, there can be multiple reasons why a query fails. To support users in such situations, differential queries can be used, which deliver the missing parts of a graph query. Multiple heuristics have been proposed for differential queries to reduce the search space. Although they succeed in improving performance, they can discard query subgraphs that are relevant to a user. To address this issue, the authors extend the concept of differential queries and introduce top-k differential queries, which calculate the ranking based on users’ preferences and significantly support the users’ understanding of query database management systems. A user assigns relevance weights to elements of a graph query; these weights steer the search and are used for the ranking. In this paper the authors propose different strategies for the selection of relevance weights and their propagation. As a result, the search is modelled along the most relevant paths. The authors evaluate their solution and both strategies on the DBpedia data graph. info:eu-repo/classification/ddc/330 ddc:330 info:eu-repo/classification/ddc/004 ddc:004
33	Graphdatenbanken für die textorientierten e-Humanities [Graph databases for the text-oriented e-Humanities] Efer, Thomas 08 February 2017 In light of the recent massive digitization efforts, most of the humanities disciplines are currently undergoing a fundamental transition towards the widespread application of digital methods. A gap in methodology and communication exists between these traditional scholarly fields and computer science, which the so-called "e-Humanities" aim to bridge systematically through interdisciplinary project work. With text being the most common object of study in this field, many approaches from the area of Text Mining have been adapted to problems of the disciplines. While common workflows and best practices slowly emerge, it is evident that generic solutions are often not a good fit for specific application scenarios. To be able to create custom-tailored digital tools, one of the central issues is to represent the text digitally in an adequate manner, together with its many contexts and related objects of interest. This thesis introduces a novel form of text representation based on Property Graph databases, an emerging technology used to store and query highly interconnected data sets. Based on this modeling paradigm, a new text research system called "Kadmos" is introduced. It provides user-definable asynchronous web services and is built to allow flexible extension of the data model and system functionality within a prototype-driven development process. With Kadmos it is possible to scale up easily to text collections containing hundreds of millions of words on a single device, and even further when using a machine cluster. It is shown how various Text Mining methods can be implemented with, and adapted for, the graph representation at a very fine level of granularity, allowing the creation of fitting digital tools for different aspects of scholarly work.
In extended usage scenarios it is demonstrated how the graph-based modeling of domain data can be beneficial even in research scenarios that go beyond a purely text-based study. info:eu-repo/classification/ddc/500 ddc:500
34	Einsatz von Graphdatenbanken für das Produktdatenmanagement im Kontext von Industrie 4.0 [Using graph databases for product data management in the context of Industry 4.0] Sauer, Christopher, Schleich, Benjamin, Wartzack, Sandro 03 January 2020 In the course of the digital transformation associated with Industry 4.0, a multitude of new data sources is emerging that product data management must take into account. One example is Industry 4.0 data collected, for instance, by sensors in manufacturing. These data sources are characterised by increasingly heterogeneous data that can no longer be captured in a single table; examples include images from optical part inspection or the code used for part inspection. This leads to the creation of many new individual silos in which the data must be processed separately from the PDM system. Moreover, data are stored there in isolation from the other silos. In addition, a multitude of new authoring systems (inspection software, customer management, requirements management) increases the volume of data, which can no longer be captured sensibly in classical table-based, purely relational database systems. With purely relational database systems, obtaining information often requires complicated queries, which access several different tables within the database and assemble the relevant information from them. However, the larger these databases become and the more information has to be linked relationally, the more expert knowledge of the particular database system is needed. Purely relational (SQL-based) systems thus also lose a large part of the advantages of their logical structure. To address these problems, new approaches from the field of Linked Data can be used.
With Linked Data, not only the raw data but also descriptive and linking information is used and passed on in order to interpret the data. This added information makes it possible, as a first step, to link heterogeneous product and process data, that is, data from a wide variety of sources such as design, simulation and quality assurance. This linking creates a higher-level form of representation that contains meaningful links in addition to the raw data and is therefore semantically richer. The resulting networked database can be implemented, for example, as a graph-oriented database, or graph database. This contribution examines to what extent such modelling is possible with currently existing graph database solutions. Starting from an example with a simplified product and process data model for sheet-bulk metal forming, a general method is presented by which an SQL-based database system can be converted into a graph database. This method is used to show how existing solutions can, in part, exist in parallel with new Linked Data databases so that they can be converted into a graph database step by step. The results of the contribution are, on the one hand, the general procedure model for introducing graph databases and, on the other hand, statements on the usability of the presented solution for product and process data management. [... from the introduction] info:eu-repo/classification/ddc/620 ddc:620
35	Answering “Why Empty?” and “Why So Many?” queries in graph databases Vasilyeva, Elena, Thiele, Maik, Bornhövd, Christof, Lehner, Wolfgang 04 July 2023 Graph databases provide schema-flexible storage and support complex, expressive queries. However, this flexibility and expressiveness come at additional cost: queries can return unexpectedly empty answers or too many answers, which are difficult to resolve manually. To address this, we introduce subgraph-based solutions for the graph queries “Why Empty?” and “Why So Many?”, which identify the part of a graph query that is responsible for an unexpected result. We also extend our solutions to account for the specifics of the graph model in use and to increase efficiency, and we evaluate them experimentally in an in-memory column database. info:eu-repo/classification/ddc/004 ddc:004
