31 |
GRAPHITE: An Extensible Graph Traversal Framework for Relational Database Management Systems. Paradies, Marcus; Lehner, Wolfgang; Bornhövd, Christof. 25 August 2022.
Graph traversals are a basic but fundamental ingredient for a variety of graph algorithms and graph-oriented queries. To achieve the best possible query performance, they need to be implemented at the core of a database management system that aims at storing, manipulating, and querying graph data. Increasingly, modern business applications demand native graph query and processing capabilities for enterprise-critical operations on data stored in relational database management systems. In this paper we propose an extensible graph traversal framework (GRAPHITE) as a central graph processing component on a common storage engine inside a relational database management system.
We study the influence of the graph topology on the execution time of graph traversals and derive two traversal algorithm implementations specialized for different graph topologies and traversal queries. We conduct extensive experiments on GRAPHITE for a large variety of real-world graph data sets and input configurations. Our experiments show that the proposed traversal algorithms can differ in execution time by up to two orders of magnitude for different input configurations, demonstrating the need for a versatile framework to efficiently process graph traversals on a wide range of graph topologies and query types. Finally, we show that the query performance of our traversal implementations is competitive with that of two native graph database management systems.
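No code accompanies the abstract; as a rough illustration of the traversal workload such a framework evaluates, the following is a minimal sketch of a level-synchronous breadth-first traversal over a two-column edge list (all names and data are illustrative, not taken from GRAPHITE):

```python
# Minimal sketch (not from the paper): a level-synchronous BFS over an
# edge list stored as two parallel columns -- the shape of traversal a
# relational engine must evaluate. Column names and data are illustrative.
def bfs_levels(src_col, dst_col, start, max_depth=None):
    """Return {vertex: depth} for all vertices reachable from `start`."""
    visited = {start: 0}
    frontier = {start}
    depth = 0
    while frontier and (max_depth is None or depth < max_depth):
        depth += 1
        # One pass over the edge columns per level, mirroring a
        # scan-based traversal over columnar storage.
        next_frontier = {
            d for s, d in zip(src_col, dst_col)
            if s in frontier and d not in visited
        }
        for v in next_frontier:
            visited[v] = depth
        frontier = next_frontier
    return visited

# Example: a small directed graph as parallel columns.
src = [0, 0, 1, 2, 3]
dst = [1, 2, 3, 3, 4]
print(bfs_levels(src, dst, start=0))  # {0: 0, 1: 1, 2: 1, 3: 2, 4: 3}
```

In a column store, each per-level expansion corresponds to a scan of the edge columns, which is the kind of cost that topology-specialized traversal algorithms target.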
|
32 |
GRATIN: Accelerating Graph Traversals in Main-Memory Column Stores. Paradies, Marcus; Rudolf, Michael; Bornhövd, Christof; Lehner, Wolfgang. 25 August 2022.
Native graph query and processing capabilities have become indispensable for modern business applications with enterprise-critical operations on data stored in relational database management systems. Traversal operations are a basic ingredient of graph algorithms and graph queries, and consequently are fundamental for querying graph data in a relational database management system. In this paper we present GRATIN, a concise secondary index structure to speed up graph traversals in main-memory column stores. Conventional approaches to graph traversals rely on repeated full column scans, which is inefficient for deep traversals on very large graphs. To tackle this challenge, we devise a novel, adaptive block-based index to handle such graphs efficiently. Most importantly, GRATIN is updateable in constant time, which supports evolving graphs with frequent updates to the graph topology. We conducted an extensive evaluation on real-world data sets from different domains for a large variety of traversal queries. Our experiments show improvements of up to an order of magnitude compared to a scan-based traversal algorithm.
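GRATIN's internals are not given in the abstract; the following is only a hedged sketch of the general idea of a block-based secondary index over columnar edge data (block size, summary contents, and pruning rule are assumptions for illustration, not GRATIN's actual design):

```python
# Minimal sketch (assumptions, not GRATIN's design): a block summary index
# over a columnar edge list. Each fixed-size block of the source column
# stores the min/max vertex id it contains, so a traversal can skip blocks
# that cannot hold any frontier vertex instead of scanning the full column.
BLOCK_SIZE = 4

def build_block_index(src_col):
    index = []
    for i in range(0, len(src_col), BLOCK_SIZE):
        block = src_col[i:i + BLOCK_SIZE]
        index.append((min(block), max(block)))
    return index

def expand(frontier, src_col, dst_col, index):
    """Return successors of `frontier`, scanning only candidate blocks."""
    lo, hi = min(frontier), max(frontier)
    out = set()
    for b, (bmin, bmax) in enumerate(index):
        if bmax < lo or bmin > hi:   # block cannot contain any frontier id
            continue
        start = b * BLOCK_SIZE
        for s, d in zip(src_col[start:start + BLOCK_SIZE],
                        dst_col[start:start + BLOCK_SIZE]):
            if s in frontier:
                out.add(d)
    return out

src = [0, 0, 1, 1, 2, 2, 3, 3, 7, 7, 8, 9]
dst = [1, 2, 3, 4, 4, 5, 5, 6, 8, 9, 9, 7]
idx = build_block_index(src)
print(expand({0, 1}, src, dst, idx))  # {1, 2, 3, 4}
```

Appending new edges only touches the summary of the last block, which is one way such an index can stay updateable in constant time, as the abstract claims for GRATIN.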
|
33 |
A latency comparison in a sharded database environment: A study between Vitess-MySQL and CockroachDB. Lundh, Filip; Mohlin, Mikael. January 2022.
The world is becoming more and more digitized, which puts pressure on existing applications and systems to handle large quantities of data. In some cases, that data also needs to be handled in secure and isolated environments. To address these needs, a new category of databases has emerged under the name NewSQL. The downside of this new category is that it remains unexplored in some areas, such as how the databases within the category perform relative to each other, or to databases belonging to other categories. One major aspect of performance is latency, since it affects the overall user experience. To clear up some of the unexplored areas within NewSQL, two databases were studied in terms of their latency: CockroachDB and Vitess. The study was divided into two main parts. The first was a quantitative study that gathered data on how each database performed in terms of latency when serving create, read, update, and delete operations. No clear latency differences were found for the create and read operations, while the results for the update and delete operations showed significant differences, with Vitess having lower latency than CockroachDB. The second part was a qualitative study dedicated to analyzing and inspecting each database's architecture and source code, with the intention of identifying potential factors that may affect latency. The analysis identified three main factors. First, CockroachDB has a layered architecture and needs to translate SQL queries into a set of key-value operations. Second, the databases use different storage engines, which in turn can differ in performance. Third, MySQL, which Vitess is built on, has existed for a longer period of time than CockroachDB, which suggests it has been more heavily optimized over the years.
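The thesis' benchmark harness is not shown here; a minimal sketch of how per-operation latency can be measured over a DB-API connection (the connection `conn`, the table, and the run count are assumptions, not the authors' setup) might look like:

```python
# Minimal sketch (assumptions: a DB-API connection `conn` and a table
# `users(id INT PRIMARY KEY, name TEXT)` already exist): measuring the
# median latency of one CRUD statement, repeated over many runs.
import statistics
import time

def time_query(conn, sql, params, runs=100):
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        cur = conn.cursor()
        cur.execute(sql, params)
        conn.commit()
        latencies.append((time.perf_counter() - start) * 1000)  # milliseconds
    return statistics.median(latencies)

# Example usage (hypothetical):
# med_update = time_query(conn, "UPDATE users SET name=%s WHERE id=%s",
#                         ("alice", 1))
```

Reporting a median over many runs, rather than a single timing, is the usual way to keep network jitter and warm-up effects from dominating such a comparison.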
|
34 |
Database Selection Process in Very Small Enterprises in Software Development: A Case Study Examining Factors, Methods, and Properties. Adolfsson, Teodor; Sundin, Axel. January 2023.
This thesis investigates the database model selection process in very small enterprises (VSEs), looking into how priorities and needs differ from what existing theory in the area proposes. The study was conducted as a case study of a two-person company engaged in developing various applications and performing consulting tasks. Data was collected through two semi-structured interviews. The first interview aimed to understand the company's process for selecting a database model, while the second focused on obtaining their perspective on how their selection process differed from the theoretical recommendations and suggested methodology. The purpose was to investigate the important factors involved in the process and explore why and how they deviated from what the theory proposes. The study concludes that VSEs have different priorities than larger enterprises. Factors such as transaction volume do not need to be weighed heavily at the scale of a VSE. It is more important to look at the total cost of the database solution, including making sure that the selected technology is sufficiently efficient to use in development and relatively easy to maintain. Regarding selection methodology, it was concluded that the time required to determine the best available database solution can be better spent elsewhere in the enterprise, and that finding a good-enough solution to get the wheels off the ground is likely a more profitable aim.
|
35 |
Data Management Support for Notification Services. Lehner, Wolfgang. 17 July 2023.
Database management systems are highly specialized to efficiently organize and process huge amounts of data in a transactional manner. In recent years, however, database management systems have evolved into a central hub for the integration of mostly heterogeneous and autonomous data sources to provide homogenized data access. The next step in pushing database technology forward to play the role of an information marketplace is to actively notify registered users about incoming messages or changes in the underlying data set. Notification services may therefore be seen as a generic term for subscription systems or, more generally, data stream systems, both of which enable the processing of standing queries over transient data.
This article gives a comprehensive introduction to the context of notification services by outlining their differences from the classical query/response-based communication pattern, illustrating potential application areas, and discussing the requirements on the underlying data management support. In more depth, the article describes the core concepts of the PubScribe project from three different perspectives. From the first perspective, the subscription process and its mapping onto the primitive publish/subscribe communication pattern is explained. The second part focuses on a hybrid subscription data model, describing its basic constructs from a structural as well as an operational point of view. Finally, the PubScribe notification service project is characterized by a storage and processing model based on relational database technology.
To summarize, this contribution introduces the idea of notification services from an application point of view by inverting the database approach: dealing with persistent queries and transient data. Moreover, the article provides an insight into the database technology that must be exploited and adapted to provide a solid base for a scalable notification infrastructure, using the PubScribe project as an example.
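PubScribe's actual interfaces are not reproduced here; the following minimal sketch only illustrates the inverted pattern the article describes, persistent (standing) queries evaluated against transient data, with all names hypothetical:

```python
# Minimal sketch (illustrative, not PubScribe's API): standing queries as
# predicates registered by subscribers; each published message is matched
# against every registered subscription -- queries persist, data is transient.
from typing import Callable

class NotificationService:
    def __init__(self):
        self._subs: list[tuple[Callable[[dict], bool], Callable]] = []

    def subscribe(self, predicate, callback):
        """Register a standing query (predicate) and a notification callback."""
        self._subs.append((predicate, callback))

    def publish(self, message: dict):
        """Evaluate every standing query against the incoming message."""
        for predicate, callback in self._subs:
            if predicate(message):
                callback(message)

svc = NotificationService()
svc.subscribe(lambda m: m.get("topic") == "stock" and m.get("price", 0) > 100,
              lambda m: print("notify:", m))
svc.publish({"topic": "stock", "symbol": "XYZ", "price": 123.4})
```

A production notification service would additionally persist subscriptions and evaluate them incrementally inside the database engine, which is exactly the data management support the article examines.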
|
36 |
Role-based Data Management. Jäkel, Tobias. 29 May 2017.
Database systems form an integral component of today's software systems, and as such they are the central point for storing and sharing a software system's data while ensuring global data consistency at the same time. Introducing the primitives of roles and their accompanying metatype distinction into modeling and programming languages results in a novel paradigm of designing, extending, and programming modern software systems. In detail, roles as a modeling concept enable a separation of concerns within an entity. Along with its rigid core, an entity may acquire various roles in different contexts during its lifetime and thus adapt its behavior and structure dynamically at runtime.
Unfortunately, database systems, as an important component and the global consistency provider of such systems, do not keep pace with this trend. The absence of a metatype distinction, in terms of an entity's separation of concerns, in the database system results in various problems for the software system in general, for the application developers, and finally for the database system itself. In the case of relational database systems, these problems are collected under the term role-relational impedance mismatch. In particular, the whole software system is designed using different semantics on its various layers. For role-based software systems in combination with relational database systems, this gap in semantics between the applications and the database system increases dramatically. Consequently, the database system cannot directly represent the richer semantics of roles or the accompanying consistency constraints. These constraints have to be ensured by the applications, and the database system loses its single-point-of-truth characteristic in the software system. As the applications are in charge of guaranteeing global consistency, their development requires more effort in data management. Moreover, the software system's data management is distributed over several layers, which results in an unstructured software system architecture.
To overcome the role-relational impedance mismatch and bring the database system back into its rightful position as the single point of truth in a software system, this thesis introduces the novel, tripartite RSQL approach. It combines a novel database model that represents the metatype distinction as a first-class citizen in a database system, a query language adapted to that database model, and a proper result representation. Precisely, RSQL's logical database model introduces Dynamic Data Types to directly represent the separation of concerns within an entity type at the schema level. At the instance level, the database model defines the notion of a Dynamic Tuple, which combines an entity with the notion of roles and thus allows for dynamic structure adaptations at runtime without changing an entity's overall type.
These definitions build the main data structures on which the database system operates. Moreover, formal operators connecting the query language statements with the database model's data structures complete the database model. The query language, as the external database system interface, features an individual data definition, data manipulation, and data query language. Their statements directly represent the metatype distinction to address Dynamic Data Types and Dynamic Tuples, respectively. As a consequence of the novel data structures, the query processing of Dynamic Tuples is completely redesigned. As the last piece of a complete database integration of the role-based notion and its accompanying metatype distinction, we specify the RSQL Result Net as the result representation. It provides a novel result structure and features functionalities to navigate through query results. Finally, we evaluate all three RSQL components in comparison to a relational database system. This assessment clearly demonstrates the benefits of fully integrating the roles concept into the database.
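RSQL itself is a database-level approach; purely as an illustration of the role concept it integrates, here is a minimal sketch of an entity that acquires and drops roles at runtime without changing its core type (names and structure are hypothetical, not RSQL syntax):

```python
# Minimal sketch (assumptions, not RSQL itself): an entity with a rigid
# core type that acquires and drops roles at runtime, changing structure
# and behavior without changing its core type -- the intuition behind
# RSQL's Dynamic Tuples.
class Entity:
    def __init__(self, core_type: str, **core_attrs):
        self.core_type = core_type
        self.core = core_attrs
        self.roles: dict[str, dict] = {}   # role name -> role attributes

    def acquire(self, role: str, **attrs):
        self.roles[role] = attrs            # structural adaptation at runtime

    def drop(self, role: str):
        self.roles.pop(role, None)

p = Entity("Person", name="Ada")
p.acquire("Employee", company="ACME", salary=5000)
p.acquire("Student", university="TUD")
p.drop("Student")
print(p.core_type, p.core, p.roles)
# Person {'name': 'Ada'} {'Employee': {'company': 'ACME', 'salary': 5000}}
```

A relational schema has no native counterpart to the `roles` part of this object, which is one concrete face of the role-relational impedance mismatch the thesis addresses.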
|
37 |
Automatic Reasoning Techniques for Non-Serializable Data-Intensive Applications. Gowtham Kaki. 14 August 2019.
The performance bottlenecks in modern data-intensive applications have induced database implementors to forsake high-level abstractions and trade off simplicity and ease of reasoning for performance. Among the first casualties of this trade-off are the well-known ACID guarantees, which simplify reasoning about concurrent database transactions. ACID semantics have become increasingly obsolete in practice because serializable isolation, an integral aspect of ACID, is exorbitantly expensive. Databases, including the popular commercial offerings, default to weaker levels of isolation where effects of concurrent transactions are visible to each other. Such weak isolation guarantees, however, are extremely hard to reason about, and have led to serious safety violations in real applications. The problem is further complicated in a distributed setting with asynchronous state replication, where high availability and low latency requirements compel large-scale web applications to embrace weaker forms of consistency (e.g., eventual consistency) besides weak isolation. Given the serious practical implications of safety violations in data-intensive applications, there is a pressing need to extend the state of the art in program verification to reach non-serializable data-intensive applications operating in a weakly-consistent distributed setting.
This thesis sets out to do just that. It introduces new language abstractions, program logics, reasoning methods, and automated verification and synthesis techniques that collectively allow programmers to reason about non-serializable data-intensive applications in the same way as their serializable counterparts. The contributions made are broadly threefold. Firstly, the thesis introduces a uniform formal model to reason about weakly isolated (non-serializable) transactions on a sequentially consistent (SC) relational database machine. A reasoning method that relates the semantics of weak isolation to the semantics of the database program is presented, and an automation technique, implemented in a tool called ACIDifier, is also described. The second contribution of this thesis is a relaxation of the machine model from sequential consistency to a specifiable level of weak consistency, and a generalization of the data model from relational to schema-less or key-value. A specification language to express weak consistency semantics at the machine level is described, and a bounded verification technique, implemented in a tool called Q9, is presented that bridges the gap between consistency specifications and program semantics, thus allowing high-level safety properties to be verified under arbitrary consistency levels. The final contribution of the thesis is a programming model inspired by version control systems that guarantees correct-by-construction replicated data types (RDTs) for building complex distributed applications with arbitrarily-structured replicated state. A technique based on decomposing inductively-defined data types into characteristic relations is presented, which is used to reason about the semantics of the data type under state replication, and eventually derive its correct-by-construction replicated variant automatically. An implementation of the programming model, called Quark, on top of content-addressable storage is described, and the practicality of the programming model is demonstrated with the help of various case studies.
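As a loose illustration of the version-control-inspired replicated data types the final contribution describes (not Quark's actual model), a replicated counter with a three-way merge against a common ancestor could look like:

```python
# Minimal sketch (illustrative, not Quark's programming model): a replicated
# counter as a data type with a three-way merge. Each replica diverges from
# a common ancestor (lca) and merges deterministically, so no update is lost.
class Counter:
    def __init__(self, value: int = 0):
        self.value = value

    def add(self, n: int):
        self.value += n

    def merge(self, lca: "Counter", other: "Counter") -> "Counter":
        # Combine both replicas' deltas relative to the common ancestor.
        return Counter(lca.value
                       + (self.value - lca.value)
                       + (other.value - lca.value))

base = Counter(10)
r1, r2 = Counter(10), Counter(10)
r1.add(5)          # replica 1 diverges: 15
r2.add(-3)         # replica 2 diverges: 7
print(r1.merge(base, r2).value)  # 12 -- both updates survive the merge
```

For richer, inductively-defined types, deriving such a merge automatically is exactly where the thesis' characteristic-relations technique comes in.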
|
38 |
Sistema de gerenciamento da informação: alterações neurológicas em chagásicos crônicos não-cardíacos / Information Management System: neurological disorders in non-cardiac chronic Chagas disease patients. Carmo, Samuel Sullivan. 27 April 2010.
This work concerns the development of a computer-based information management system to support scientific studies of the nervous system of non-cardiac chronic Chagas disease patients. The goal is to develop the required system on the premise that it makes the analyses arising from the investigation more practical. The method used to develop this system, dedicated to managing the information of research on the neurological disorders of its subjects, was to: compose the archetype of goals and the requirements-elicitation matrix for the system's variants; list the attributes, domains, and qualifications of its variables; draw up the selection framework for the equipment and applications required for its physical and logical implementation; and deploy it through data modeling, an adapted entity-relationship diagram, and logic programming of algorithms. As a result, the system was developed. The analysis argues that computerization makes the operations of registration, querying, and field validation more effective, as well as the formatting and export of pre-processed tables for statistical analysis, thus acting as a tool of the scientific method. The logical argument is that the reliability of computationally recorded information is increased because human error is reduced in most processing steps. In closing, studies with a sizable number of variables and research subjects are better managed if they have a system dedicated to managing their information.
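The system itself is not listed here; a minimal sketch of the registration-with-validation and table-export operations the abstract mentions (schema, field names, and file format are hypothetical) could look like:

```python
# Minimal sketch (hypothetical fields, not the actual system): record
# registration with field validation and export of a pre-processed table,
# the kinds of operations the abstract says the system automates.
import csv

SCHEMA = {"subject_id": int, "age": int, "neuro_score": float}

def validate(record: dict) -> dict:
    """Coerce and check every field against the schema; raise on bad input."""
    return {field: cast(record[field]) for field, cast in SCHEMA.items()}

def export(records: list[dict], path: str):
    """Write validated records as a table ready for statistical analysis."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(SCHEMA))
        writer.writeheader()
        writer.writerows(records)

db = [validate({"subject_id": "1", "age": "54", "neuro_score": "2.5"})]
export(db, "subjects.csv")
```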
|
39 |
Resource Centered Store. Heese, Ralf. 04 January 2016.
The Resource Description Framework (RDF) is the conceptual foundation for representing properties of real-world or virtual resources and describing the relationships between them. Standards based on RDF allow machines to access and process information automatically, locate additional data about resources, recognize different meanings of a string, and derive implicit information. The smallest information units in RDF are triples, which form a directed labeled multigraph. The query language SPARQL is also based on a graph model, which makes it difficult for relational DBMS to store and query RDF data efficiently. The most performant DBMS for managing and querying RDF data implement an RDF-specific storage model based on a set of B+-tree indexes. The key disadvantages of these systems are the increased use of secondary storage caused by redundantly stored triples, as well as the necessity of expensive join-like operations to compute the solutions of a SPARQL query, which degrades the performance of larger queries. In this work we develop and describe the Resource Centered Store (RCS), which exploits RDF-inherent characteristics to answer queries efficiently without the need for redundant storage. In the RCS storage model, triples are grouped by their first component (subject), and these star-shaped subgraphs are stored together on database pages, similar to relational DBMS. As a result, the RCS can benefit from principles and algorithms developed in the context of relational databases. Additionally, we define transformation rules and heuristics to optimize SPARQL queries and generate an efficient query execution plan. In this context we also specify graph-pattern-based indexes and investigate their benefits for computing the solutions of queries. We implemented the RCS storage model prototypically and compared it to the native RDF DBMS Jena TDB. Our experiments show that the system is especially suited to answering queries with large star-shaped subpatterns.
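The RCS prototype is not shown; the following minimal sketch only illustrates the core idea of grouping triples by subject so that a star-shaped pattern becomes a per-subject lookup rather than a chain of joins (data and API are illustrative):

```python
# Minimal sketch (illustrative, not the RCS implementation): triples grouped
# by subject into star-shaped records, so a star-shaped SPARQL pattern
# resolves against one record per subject instead of per-triple joins.
from collections import defaultdict

triples = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:age", "42"),
    ("ex:bob", "rdf:type", "ex:Person"),
]

# Group by subject -- the RCS idea of storing a star subgraph together.
store: dict[str, dict[str, list[str]]] = defaultdict(lambda: defaultdict(list))
for s, p, o in triples:
    store[s][p].append(o)

def match_star(pattern: dict[str, str]):
    """Find subjects whose star matches all (predicate, object) pairs."""
    return [s for s, star in store.items()
            if all(o in star.get(p, []) for p, o in pattern.items())]

print(match_star({"rdf:type": "ex:Person", "ex:knows": "ex:bob"}))
# ['ex:alice']
```

Because the whole star for a subject lives together, a pattern like the one above touches a single record, which is why the evaluation finds the model strongest on large star-shaped subpatterns.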
|
40 |
Спецификација и валидација ограничења у XML моделу података / Specifikacija i validacija ograničenja u XML modelu podataka / Specification and Validation of Constraints in XML Data Model. Vidaković, Jovana. 06 July 2015.
The goal of the research conducted in this thesis was to formally describe the types of constraints in the XML data model, following the types of constraints in the relational data model. In accordance with this goal, the types of constraints in the XML data model were classified, formally specified, and implemented in representative XML DBMS.
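The thesis' formal specifications are not reproduced here; as a small illustration of one constraint type such a classification covers, a key-style uniqueness check over XML (the document structure and helper are hypothetical) might look like:

```python
# Minimal sketch (illustrative, hypothetical document structure): checking
# a key-style uniqueness constraint over XML, analogous to a relational
# primary key -- one of the constraint types such a classification covers.
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<library>
  <book id="b1"><title>A</title></book>
  <book id="b2"><title>B</title></book>
  <book id="b1"><title>C</title></book>
</library>""")

def check_key(root, element_path: str, key_attr: str) -> list[str]:
    """Return key values that violate uniqueness among selected elements."""
    seen, duplicates = set(), []
    for el in root.findall(element_path):
        key = el.get(key_attr)
        if key in seen:
            duplicates.append(key)
        seen.add(key)
    return duplicates

print(check_key(doc, "book", "id"))  # ['b1'] -- uniqueness violated
```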
|