Global ETD Search

1	GRAPHITE: An Extensible Graph Traversal Framework for Relational Database Management Systems Paradies, Marcus, Lehner, Wolfgang, Bornhövd, Christof 25 August 2022 (has links) Graph traversals are a basic but fundamental ingredient for a variety of graph algorithms and graph-oriented queries. To achieve the best possible query performance, they need to be implemented at the core of a database management system that aims at storing, manipulating, and querying graph data. Increasingly, modern business applications demand native graph query and processing capabilities for enterprise-critical operations on data stored in relational database management systems. In this paper we propose an extensible graph traversal framework (GRAPHITE) as a central graph processing component on a common storage engine inside a relational database management system. We study the influence of the graph topology on the execution time of graph traversals and derive two traversal algorithm implementations specialized for different graph topologies and traversal queries. We conduct extensive experiments on GRAPHITE for a large variety of real-world graph data sets and input configurations. Our experiments show that the proposed traversal algorithms differ by up to two orders of magnitude for different input configurations and therefore demonstrate the need for a versatile framework to efficiently process graph traversals on a wide range of different graph topologies and types of queries. Finally, we highlight that the query performance of our traversal implementations is competitive with those of two native graph database management systems. info:eu-repo/classification/ddc/004 ddc:004
2	GRATIN: Accelerating Graph Traversals in Main-Memory Column Stores Paradies, Marcus, Rudolf, Michael, Bornhövd, Christof, Lehner, Wolfgang 25 August 2022 (has links) Native graph query and processing capabilities have become indispensable for modern business applications in enterprise-critical operations on data that is stored in relational database management systems. Traversal operations are a basic ingredient of graph algorithms and graph queries. As a consequence, they are fundamental for querying graph data in a relational database management system. In this paper we present gratin, a concise secondary index structure to speedup graph traversals in main-memory column stores. Conventional approaches for graph traversals rely on repeated full column scans, making it an inefficient approach for deep traversals on very large graphs. To tackle this challenge, we devise a novel and adaptive block-based index to handle graphs efficiently. Most importantly, gratin is updateable in constant time and allows supporting evolving graphs with frequent updates to the graph topology. We conducted an extensive evaluation on real-world data sets from different domains for a large variety of traversal queries. Our experiments show improvements of up to an order of magnitude compared to a scan-based traversal algorithm. info:eu-repo/classification/ddc/004 ddc:004
3	AQuES / ein flexibles, verteiltes System zur Optimierung und Auswertung relationaler Anfragen Stillger, Michael 21 January 2000 (has links) Die parallele Anfragebearbeitung für relationale Datenbankmanagementsysteme (RDBMS) ist wegen ihrer unterschiedlichen Arten der Ausführungsparallelität und den Eigenschaften der zugrunde liegenden parallelen Architektur ein äusserst komplexes Problem. Systemänderungen zur Laufzeit der Anfrage können zusätzlich ein dynamisches Verhalten der ausführenden Komponenten erfordern, um eine nahezu optimale Antwortzeit zu gewährleisten. Diese Arbeit stellt einen neuen, flexiblen Ansatz für die Optimierung und Abarbeitung von komplexen Anfragen vor, der besonders die dynamische Optimierung berücksichtigt. Insbesondere werden in der Arbeit folgende Teile präsentiert: 1. die Architektur eines neuen, verteilt-kooperierenden Komponentensystems beeinflusst von agenten-orientierten Konzepten; 2. der Entwurf und die Realisierung einer neuen Kommunikationsinfrastruktur für die identifizierten Systemkomponenten; 3. der Entwurf und die Implementierung eines flexiblen Anfrageoptimierers mit einem neuen, zufallsbasierten Algorithmus; und 4. der Entwurf und die Realisierung einer parallel arbeitenden Ausführungskomponente unter besonderer Berücksichtigung der dynamischen Anfrageoptimierung. Bei der Entwicklung der Konzepte standen neben den spezifischen Anforderungen für RDBMS besonders die Konfigurierbarkeit und die Erweiterbarkeit des verteilten Systems im Vordergrund. / Parallel query evaluation for relational database management systems (RDBSM) still remains a challenging problem. Modern systems must show near optimal performance in spite of running in a heterogeneous hardware environment, exploiting different ways of parallelism and dealing with unpredictable system load. This thesis paper presents a dynamic and flexible system addressing the issues of optimization and evaluation of relational queries for a distributed and dynamic environment. In particular, this work consists of: 1) the architecture of a distributed system which was inspired by the concepts of software agents, 2) the architecture and the implementation of a communication infrastructure for the system components, 3) the architecture and the implementation of a new query optimization algorithm, and 4) the concept and the implementation of a new query evaluation engine for parallel execution, which enables runtime optimization of queries. Furthermore, the design supports the extension and the configuration of the system and its components. Anfragebearbeitung Datenbankmanagementsystem Anfrageoptimierung Software-Agenten database management systems query processing query optimization software agents 510 Mathematik 27 Mathematik ST 270 ddc:510
4	State Management for Efficient Event Pattern Detection Zhao, Bo 20 May 2022 (has links) Event Stream Processing (ESP) Systeme überwachen kontinuierliche Datenströme, um benutzerdefinierte Queries auszuwerten. Die Herausforderung besteht darin, dass die Queryverarbeitung zustandsbehaftet ist und die Anzahl von Teilübereinstimmungen mit der Größe der verarbeiteten Events exponentiell anwächst. Die Dynamik von Streams und die Notwendigkeit, entfernte Daten zu integrieren, erschweren die Zustandsverwaltung. Erstens liefern heterogene Eventquellen Streams mit unvorhersehbaren Eingaberaten und Queryselektivitäten. Während Spitzenzeiten ist eine erschöpfende Verarbeitung unmöglich, und die Systeme müssen auf eine Best-Effort-Verarbeitung zurückgreifen. Zweitens erfordern Queries möglicherweise externe Daten, um ein bestimmtes Event für eine Query auszuwählen. Solche Abhängigkeiten sind problematisch: Das Abrufen der Daten unterbricht die Stream-Verarbeitung. Ohne eine Eventauswahl auf Grundlage externer Daten wird das Wachstum von Teilübereinstimmungen verstärkt. In dieser Dissertation stelle ich Strategien für optimiertes Zustandsmanagement von ESP Systemen vor. Zuerst ermögliche ich eine Best-Effort-Verarbeitung mittels Load Shedding. Dabei werden sowohl Eingabeeevents als auch Teilübereinstimmungen systematisch verworfen, um eine Latenzschwelle mit minimalem Qualitätsverlust zu garantieren. Zweitens integriere ich externe Daten, indem ich das Abrufen dieser von der Verwendung in der Queryverarbeitung entkoppele. Mit einem effizienten Caching-Mechanismus vermeide ich Unterbrechungen durch Übertragungslatenzen. Dazu werden externe Daten basierend auf ihrer erwarteten Verwendung vorab abgerufen und mittels Lazy Evaluation bei der Eventauswahl berücksichtigt. Dabei wird ein Kostenmodell verwendet, um zu bestimmen, wann welche externen Daten abgerufen und wie lange sie im Cache aufbewahrt werden sollen. Ich habe die Effektivität und Effizienz der vorgeschlagenen Strategien anhand von synthetischen und realen Daten ausgewertet und unter Beweis gestellt. / Event stream processing systems continuously evaluate queries over event streams to detect user-specified patterns with low latency. However, the challenge is that query processing is stateful and it maintains partial matches that grow exponentially in the size of processed events. State management is complicated by the dynamicity of streams and the need to integrate remote data. First, heterogeneous event sources yield dynamic streams with unpredictable input rates, data distributions, and query selectivities. During peak times, exhaustive processing is unreasonable, and systems shall resort to best-effort processing. Second, queries may require remote data to select a specific event for a pattern. Such dependencies are problematic: Fetching the remote data interrupts the stream processing. Yet, without event selection based on remote data, the growth of partial matches is amplified. In this dissertation, I present strategies for optimised state management in event pattern detection. First, I enable best-effort processing with load shedding that discards both input events and partial matches. I carefully select the shedding elements to satisfy a latency bound while striving for a minimal loss in result quality. Second, to efficiently integrate remote data, I decouple the fetching of remote data from its use in query evaluation by a caching mechanism. To this end, I hide the transmission latency by prefetching remote data based on anticipated use and by lazy evaluation that postpones the event selection based on remote data to avoid interruptions. A cost model is used to determine when to fetch which remote data items and how long to keep them in the cache. I evaluated the above techniques with queries over synthetic and real-world data. I show that the load shedding technique significantly improves the recall of pattern detection over baseline approaches, while the technique for remote data integration significantly reduces the pattern detection latency. Datenstromverarbeitung Complex event processing Mustererkennung Datenbankmanagementsystem Data stream processing Complex event processing Pattern detection Database management systems 004 Informatik ST 265 ddc:004
5	Role-based Data Management Jäkel, Tobias 29 May 2017 (has links) (PDF) Database systems build an integral component of today’s software systems and as such they are the central point for storing and sharing a software system’s data while ensuring global data consistency at the same time. Introducing the primitives of roles and their accompanied metatype distinction in modeling and programming languages, results in a novel paradigm of designing, extending, and programming modern software systems. In detail, roles as modeling concept enable a separation of concerns within an entity. Along with its rigid core, an entity may acquire various roles in different contexts during its lifetime and thus, adapts its behavior and structure dynamically during runtime. Unfortunately, database systems, as important component and global consistency provider of such systems, do not keep pace with this trend. The absence of a metatype distinction, in terms of an entity’s separation of concerns, in the database system results in various problems for the software system in general, for the application developers, and ﬁnally for the database system itself. In case of relational database systems, these problems are concentrated under the term role-relational impedance mismatch. In particular, the whole software system is designed by using different semantics on various layers. In case of role-based software systems in combination with relational database systems this gap in semantics between applications and the database system increases dramatically. Consequently, the database system cannot directly represent the richer semantics of roles as well as the accompanied consistency constraints. These constraints have to be ensured by the applications and the database system loses its single point of truth characteristic in the software system. As the applications are in charge of guaranteeing global consistency, their development requires more effort in data management. Moreover, the software system’s data management is distributed over several layers, which results in an unstructured software system architecture. To overcome the role-relational impedance mismatch and bring the database system back in its rightful position as single point of truth in a software system, this thesis introduces the novel and tripartite RSQL approach. It combines a novel database model that represents the metatype distinction as ﬁrst class citizen in a database system, an adapted query language on the database model’s basis, and ﬁnally a proper result representation. Precisely, RSQL’s logical database model introduces Dynamic Data Types, to directly represent the separation of concerns within an entity type on the schema level. On the instance level, the database model deﬁnes the notion of a Dynamic Tuple that combines an entity with the notion of roles and thus, allows for dynamic structure adaptations during runtime without changing an entity’s overall type. These deﬁnitions build the main data structures on which the database system operates. Moreover, formal operators connecting the query language statements with the database model data structures, complete the database model. The query language, as external database system interface, features an individual data deﬁnition, data manipulation, and data query language. Their statements directly represent the metatype distinction to address Dynamic Data Types and Dynamic Tuples, respectively. As a consequence of the novel data structures, the query processing of Dynamic Tuples is completely redesigned. As last piece for a complete database integration of a role-based notion and its accompanied metatype distinction, we specify the RSQL Result Net as result representation. It provides a novel result structure and features functionalities to navigate through query results. Finally, we evaluate all three RSQL components in comparison to a relational database system. This assessment clearly demonstrates the beneﬁts of the roles concept’s full database integration. Rollenkonzept Datenbankmanagementsystem DBMS RSQL Anfragesprache Datenbankmodell logische Operatorenen Role concept database management system DBMS RSQL query language database model compartment role object model logical database operators ddc:004 rvk:ST 270
6	Energy-Efficient In-Memory Database Computing Lehner, Wolfgang 27 June 2013 (has links) (PDF) The efficient and flexible management of large datasets is one of the core requirements of modern business applications. Having access to consistent and up-to-date information is the foundation for operational, tactical, and strategic decision making. Within the last few years, the database community sparked a large number of extremely innovative research projects to push the envelope in the context of modern database system architectures. In this paper, we outline requirements and influencing factors to identify some of the hot research topics in database management systems. We argue that—even after 30 years of active database research—the time is right to rethink some of the core architectural principles and come up with novel approaches to meet the requirements of the next decades in data management. The sheer number of diverse and novel (e.g., scientific) application areas, the existence of modern hardware capabilities, and the need of large data centers to become more energy-efficient will be the drivers for database research in the years to come. Datenbank Datenbanksystem DBS Datenbankmanagementsystem DBMS Sonderforschungsbereich 912 Hochadaptive Energieeffiziente Systeme database system database management systems Big Data Collaborative Research Centre 912 ddc:004 rvk:ST 270
7	Role-based Data Management Jäkel, Tobias 24 March 2017 (has links) Database systems build an integral component of today’s software systems and as such they are the central point for storing and sharing a software system’s data while ensuring global data consistency at the same time. Introducing the primitives of roles and their accompanied metatype distinction in modeling and programming languages, results in a novel paradigm of designing, extending, and programming modern software systems. In detail, roles as modeling concept enable a separation of concerns within an entity. Along with its rigid core, an entity may acquire various roles in different contexts during its lifetime and thus, adapts its behavior and structure dynamically during runtime. Unfortunately, database systems, as important component and global consistency provider of such systems, do not keep pace with this trend. The absence of a metatype distinction, in terms of an entity’s separation of concerns, in the database system results in various problems for the software system in general, for the application developers, and ﬁnally for the database system itself. In case of relational database systems, these problems are concentrated under the term role-relational impedance mismatch. In particular, the whole software system is designed by using different semantics on various layers. In case of role-based software systems in combination with relational database systems this gap in semantics between applications and the database system increases dramatically. Consequently, the database system cannot directly represent the richer semantics of roles as well as the accompanied consistency constraints. These constraints have to be ensured by the applications and the database system loses its single point of truth characteristic in the software system. As the applications are in charge of guaranteeing global consistency, their development requires more effort in data management. Moreover, the software system’s data management is distributed over several layers, which results in an unstructured software system architecture. To overcome the role-relational impedance mismatch and bring the database system back in its rightful position as single point of truth in a software system, this thesis introduces the novel and tripartite RSQL approach. It combines a novel database model that represents the metatype distinction as ﬁrst class citizen in a database system, an adapted query language on the database model’s basis, and ﬁnally a proper result representation. Precisely, RSQL’s logical database model introduces Dynamic Data Types, to directly represent the separation of concerns within an entity type on the schema level. On the instance level, the database model deﬁnes the notion of a Dynamic Tuple that combines an entity with the notion of roles and thus, allows for dynamic structure adaptations during runtime without changing an entity’s overall type. These deﬁnitions build the main data structures on which the database system operates. Moreover, formal operators connecting the query language statements with the database model data structures, complete the database model. The query language, as external database system interface, features an individual data deﬁnition, data manipulation, and data query language. Their statements directly represent the metatype distinction to address Dynamic Data Types and Dynamic Tuples, respectively. As a consequence of the novel data structures, the query processing of Dynamic Tuples is completely redesigned. As last piece for a complete database integration of a role-based notion and its accompanied metatype distinction, we specify the RSQL Result Net as result representation. It provides a novel result structure and features functionalities to navigate through query results. Finally, we evaluate all three RSQL components in comparison to a relational database system. This assessment clearly demonstrates the beneﬁts of the roles concept’s full database integration. info:eu-repo/classification/ddc/004 ddc:004
8	Energy-Efficient In-Memory Database Computing Lehner, Wolfgang January 2013 (has links) The efficient and flexible management of large datasets is one of the core requirements of modern business applications. Having access to consistent and up-to-date information is the foundation for operational, tactical, and strategic decision making. Within the last few years, the database community sparked a large number of extremely innovative research projects to push the envelope in the context of modern database system architectures. In this paper, we outline requirements and influencing factors to identify some of the hot research topics in database management systems. We argue that—even after 30 years of active database research—the time is right to rethink some of the core architectural principles and come up with novel approaches to meet the requirements of the next decades in data management. The sheer number of diverse and novel (e.g., scientific) application areas, the existence of modern hardware capabilities, and the need of large data centers to become more energy-efficient will be the drivers for database research in the years to come. info:eu-repo/classification/ddc/004 ddc:004
9	Resource Centered Store Heese, Ralf 04 January 2016 (has links) Mit dem Resource Description Framework (RDF) können Eigenschaften von und die Beziehungen zwischen Ressourcen maschinenverarbeitbar beschrieben werden. Dadurch werden diese Daten für Maschinen zugänglicher und können unter anderem automatisch Daten zu einer Ressource lokalisieren und verarbeiten, unterschiedliche Bedeutungen einer Zeichenkette erkennen und implizite Informationen ableiten. Das Datenmodell von RDF und der zugehörigen Anfragesprache SPARQL basiert auf gerichteten und beschrifteten Multigraphen. Forschungsergebnisse haben gezeigt, dass relationale DBMS zum Verwalten von RDF-Daten ungeeignet sind. Native basierende RDF-DBMS können Anfragen in kürzerer Zeit verarbeiten. Der Leistungsgewinn wird durch redundantes Speichern von Tripeln in mehreren B+-Bäumen erzielt. Jedoch sind Join-ähnliche Operationen zum Berechnen des Ergebnisses erforderlich, was bei größeren Anfragen zu Leistungseinbußen führt. In dieser Arbeit wird der Resource Centered Store (RCS) entwickelt, dessen Speichermodell RDF-inhärente Eigenschaften ausnutzt, um Anfragen ohne die Notwendigkeit redundanter Speicherung effizient beantworten zu können. Die grundlegende Idee des RCS-Speichermodells besteht im Gruppieren der Daten als sternförmigen Teilgraphen auf Datenbankseiten. Die verwendeten Prinzipien ähnelt denen in RDBMS und daher können deren Algorithmen zur Beantwortung von Anfragen wiederverwendet werden. Darüber hinaus werden Transformationsregeln und Heuristiken zum Optimieren von SPARQL-Anfragen zum Finden eines möglichst optimalen Ausführungsplans definiert. In diesem Kontext wurden auch graphmusterbasierte Indexe spezifiziert und deren Nutzen für die Verarbeitung von Anfragen untersucht. Das RCS-Speichermodell wurde prototypisch implementiert und im Vergleich zum nativen RDF-DBMS Jena TDB evaluiert. Die durchgeführten Experimenten zeigen, dass das System insbesondere für das Beantworten von Anfragen mit großen sternförmigen Teilmustern geeignet ist. / The Resource Description Framework (RDF) is the conceptual foundation for representing properties of real-world or virtual resources and describing the relationships between them. Standards based on RDF allow machines to access and process information automatically and locate additional data about resources. It also supports the discovery of relationships between concepts. The smallest information unit in RDF are triples which form a directed labeled multi-graph. The query language SPARQL is also based on a graph model which makes it difficult for relational DBMS to store and query RDF data efficiently. The most performant DBMS for managing and querying RDF data implement a RDF-specific storage model based on a set of B+ tree indexes. The key disadvantages of these systems are the increased usage of secondary storage in cause of redundantly stored triples as well as the necessity of expensive join operation to compute the solutions of a SPARQL query. In this work we develop and describe the Resource Centered Store which exploits RDF inherent characteristics to avoid the requirement for storing triples redundantly while improving the query performance of larger queries. In the RCS storage model triples are grouped by their first component (subject) and storing these star-shaped subgraphs on database pages -- similar to relational DBMS. As a result the RCS can benefit from principles and algorithms that have been developed in the context of relational databases. Additionally, we defined transformation rules and heuristics to optimize SPARQL queries and generate an efficient query execution plan. In this context we also defined graph pattern based indexes and investigated their benefits for computing the solutions of queries. We implemented the RCS storage model prototypically and compared it to the native RDF DBMS Jena TDB. Our experiments showed that our storage model is especially suited to speed up the query performance of large star-shaped graph pattern. Anfragebearbeitung Anfrageoptimierung SPARQL Native RDF-Datenbankmanagementsystem SPARQL Native RDF database management system Query processing Query optimization 004 Informatik 28 Informatik, Datenverarbeitung ST 250 ST 250 X70 ST 270 ddc:004

Search results