Global ETD Search

1	Considering User Intention in Differential Graph Queries Vasilyeva, Elena, Thiele, Maik, Bornhövd, Christof, Lehner, Wolfgang 30 November 2020 (has links) Empty answers are a major problem when processing pattern matching queries in graph databases. In particular, there can be multiple reasons why a query fails. To support users in such situations, differential queries can be used that deliver the missing parts of a graph query. Multiple heuristics have been proposed for differential queries, which reduce the search space. Although they succeed in increasing performance, they can discard query subgraphs relevant to a user. To address this issue, the authors extend the concept of differential queries and introduce top-k differential queries that calculate the ranking based on users’ preferences and significantly support the users’ understanding of query database management systems. A user assigns relevance weights to elements of a graph query that steer the search and are used for the ranking. In this paper the authors propose different strategies for the selection of relevance weights and their propagation. As a result, the search is modelled along the most relevant paths. The authors evaluate their solution and both strategies on the DBpedia data graph. info:eu-repo/classification/ddc/330 ddc:330 info:eu-repo/classification/ddc/004 ddc:004
2	A Domain-Specific Language for Do-It-Yourself Analytical Mashups Eberius, Julian, Thiele, Maik, Lehner, Wolfgang 26 January 2023 (has links) The increasing amount and variety of data available on the web leads to new possibilities in end-user focused data analysis. While the classic database technologies for data integration and analysis (ETL and BI) are too complex for the needs of end users, newer technologies like web mashups are not optimal for data analysis. To make productive use of the data available on the web, end users need easy ways to find, join and visualize it. We propose a domain-specific language (DSL) for querying a repository of heterogeneous web data. In contrast to query languages such as SQL, this DSL describes the visualization of the queried data in addition to the selection, filtering and aggregation of the data. The resulting data mashup can be made interactive by leaving parts of the query variable. We also describe an abstraction layer above this DSL that uses a recommendation-driven natural language interface to reduce the difficulty of creating queries in this DSL. info:eu-repo/classification/ddc/004 ddc:004
3	Ontology approach for Building Lifecycle data management Karlapudi, Janakiram, Valluru, Prathap, Menzel, Karsten 13 December 2021 (has links) The Architecture, Engineering and Construction industry involves multiple disciplines and activities throughout the Building Lifecycle Stages (BLS). To enable collaboration amongst these disciplines, an iterative and coordinated exchange of information is required. This improves the design process over multiple BLS. For the last decade, BIM has been a well-known approach to achieving collaboration through semantic representation and exchange of domain data. Despite the improvement, there is a lack of efficient implementation and management of building lifecycle functionalities in existing BIM solutions, because of their fundamental heterogeneity, complexity and adaptability. This research focuses on these issues and addresses a clear perception through analysis of BLS from various standards and norms. The paper concentrates on the demonstration of efficient representation of various BLS through the ontological approach and their effective involvement in BIM data management. With the validation and evaluation through SPARQL queries, this paper presents an ontological framework for building lifecycle data management. Contents: Abstract; Introduction & Background; Related Research Work; Ontology-Based BLS Data Management; Validation; Conclusion; Acknowledgement; References. info:eu-repo/classification/ddc/690 ddc:690
4	GRAPHITE: An Extensible Graph Traversal Framework for Relational Database Management Systems Paradies, Marcus, Lehner, Wolfgang, Bornhövd, Christof 25 August 2022 (has links) Graph traversals are a basic but fundamental ingredient for a variety of graph algorithms and graph-oriented queries. To achieve the best possible query performance, they need to be implemented at the core of a database management system that aims at storing, manipulating, and querying graph data. Increasingly, modern business applications demand native graph query and processing capabilities for enterprise-critical operations on data stored in relational database management systems. In this paper we propose an extensible graph traversal framework (GRAPHITE) as a central graph processing component on a common storage engine inside a relational database management system. We study the influence of the graph topology on the execution time of graph traversals and derive two traversal algorithm implementations specialized for different graph topologies and traversal queries. We conduct extensive experiments on GRAPHITE for a large variety of real-world graph data sets and input configurations. Our experiments show that the proposed traversal algorithms differ by up to two orders of magnitude for different input configurations and therefore demonstrate the need for a versatile framework to efficiently process graph traversals on a wide range of different graph topologies and types of queries. Finally, we highlight that the query performance of our traversal implementations is competitive with that of two native graph database management systems. info:eu-repo/classification/ddc/004 ddc:004
5	Query optimization by using derivability in a data warehouse environment Albrecht, Jens, Hümmer, Wolfgang, Lehner, Wolfgang, Schlesinger, Lutz 10 January 2023 (has links) Materialized summary tables and cached query results are frequently used for the optimization of aggregate queries in a data warehouse. Query rewriting techniques are incorporated into database systems to use those materialized views and thus avoid the access of the possibly huge raw data. A rewriting is only possible if the query is derivable from these views. Several approaches can be found in the literature to check the derivability and find query rewritings. The specific application scenario of a data warehouse with its multidimensional perspective allows the consideration of much more semantic information, e.g. structural dependencies within the dimension hierarchies and different characteristics of measures. The motivation of this article is to use this information to present conditions for derivability in a large number of relevant cases which go beyond previous approaches. info:eu-repo/classification/ddc/004 ddc:004
6	DrillBeyond: Processing Multi-Result Open World SQL Queries Eberius, Julian, Thiele, Maik, Braunschweig, Katrin, Lehner, Wolfgang 11 July 2022 (has links) In a traditional relational database management system, queries can only be defined over attributes defined in the schema, but are guaranteed to give a single, definitive answer structured exactly as specified in the query. In contrast, an information retrieval system allows the user to pose queries without knowledge of a schema, but the result will be a top-k list of possible answers, with no guarantees about the structure or content of the retrieved documents. In this paper, we present DrillBeyond, a novel IR/RDBMS hybrid system, in which the user seamlessly queries a relational database together with a large corpus of tables extracted from a web crawl. The system allows full SQL queries over the relational database, but additionally allows the user to use arbitrary additional attributes in the query that need not be defined in the schema. The system then processes this semi-specified query by computing a top-k list of possible query evaluations, each based on different candidate web data sources, thus mixing properties of RDBMS and IR systems. We design a novel plan operator that encapsulates a web data retrieval and matching system and allows direct integration of such systems into relational query processing. We then present methods for efficiently processing multiple variants of a query, by producing plans that are optimized for large invariant intermediate results that can be reused between multiple query evaluations. We demonstrate the viability of the operator and our optimization strategies by implementing them in PostgreSQL and evaluating on a standard benchmark by adding arbitrary attributes to its queries. info:eu-repo/classification/ddc/004 ddc:004
7	Merging OLTP and OLAP: Back to the Future Lehner, Wolfgang 13 January 2023 (has links) When the terms “Data Warehousing” and “Online Analytical Processing” were coined in the 1990s by Kimball, Codd, and others, there was an obvious need for separating data and workload for operational transactional-style processing and decision-making implying complex analytical queries over large and historic data sets. Large data warehouse infrastructures have been set up to cope with the special requirements of analytical query answering for multiple reasons: For example, analytical thinking heavily relies on predefined navigation paths to guide the user through the data set and to provide different views on different aggregation levels. Multi-dimensional queries exploiting hierarchically structured dimensions lead to complex star queries at a relational backend, which could hardly be handled by classical relational systems. [Off: Introduction] info:eu-repo/classification/ddc/004 ddc:004
8	Top-k Entity Augmentation using Consistent Set Covering Eberius, Julian, Thiele, Maik, Braunschweig, Katrin, Lehner, Wolfgang 19 September 2022 (has links) Entity augmentation is a query type in which, given a set of entities and a large corpus of possible data sources, the values of a missing attribute are to be retrieved. State-of-the-art methods return a single result that, to cover all queried entities, is fused from a potentially large set of data sources. We argue that queries on large corpora of heterogeneous sources using information retrieval and automatic schema matching methods cannot easily return a single result that the user can trust, especially if the result is composed from a large number of sources that the user has to verify manually. We therefore propose to process these queries in a Top-k fashion, in which the system produces multiple minimal consistent solutions from which the user can choose to resolve the uncertainty of the data sources and methods used. In this paper, we introduce and formalize the problem of consistent, multi-solution set covering, and present algorithms based on a greedy and a genetic optimization approach. We then apply these algorithms to Web table-based entity augmentation. The publication further includes a Web table corpus with 100M tables, and a Web table retrieval and matching system in which these algorithms are implemented. Our experiments show that the consistency and minimality of the augmentation results can be improved using our set covering approach, without loss of precision or coverage and while producing multiple alternative query results. info:eu-repo/classification/ddc/004 ddc:004

Search results