Spelling suggestions: "subject:"datenbankabfrage"" "subject:"datenbankanalyse""
1 |
Database-Inspired Reasoning Problems in Description Logics With Path ExpressionsBednarczyk, Bartosz 19 July 2024 (has links)
Formal ontologies are of significant importance in artificial intelligence, playing a central role in the Semantic Web, ontology-based information integration, or peer-to-peer data management. In such scenarios, an especially prominent role is played by description logics (DLs) – a robust family of logical formalisms used to describe ontologies and serving as the logical underpinning of contemporary standardised ontology languages. To put knowledge bases to full use as core part of intelligent information systems, much attention is being devoted to the area of ontology-based data-access, with conjunctive queries and their generalisations such as positive conjunctive two-way regular path queries being employed as a fundamental querying formalism. The most expressive exemplars of description logics feature advanced constructors for roles and path expressions. Among the most powerful knowledge representation formalisms on the verge of decidability, are the DLs from the Z family. For its most expressive proponent, ZOIQ (a.k.a. ALCHbSelf reg OIQ), featuring nominals (O), role inverses (I), and number restrictions (Q), querying is undecidable and even decidability of knowledge-base satisfiability is open, owing to the intricate interplay of the three mentioned features. Restricting the interaction of O, I, and Q however (or excluding one of the features altogether) leads to beneficial model-theoretic properties, which give rise to upper bounds of ExpTime for knowledge-base satisfiability and 2ExpTime for querying.
Aiming for better understanding of the “expressive power versus computational complexity” trade-off for the Z family of DLs, we provide a more fine-grained complexity analysis for the query entailment problem over ontologies. In the thesis we focus on tame fragments of ZOIQ, namely the fragments in which either one the three features from {I, O, Q} is dropped or the class of models is restricted to the so-called quasi-forests. We employ the query languages ranging from (unions of) conjunctive queries ((U)CQs) to positive two-way regular path queries (P2RPQs). We mostly follow the classical semantics of entailment, but we also provide several results in the “finite-model” scenario. The most important results of the thesis are summarised below.
1. We provide a complete classification of the complexity of the query entailment problem (for various query languages discussed above) for tamed fragments of ZOIQ under the classical semantics. This involves several new ingredients such as: (i) a uniform exponential-time algorithm based on Lutz’s spoiler technique for the entailment of unions of conjunctive queries for ALCHbreg, (ii) new lower bounds for (rooted and unrooted) conjunctive query entailment over ALCSelf ontologies, and (iii) a novel reduction from the entailment of P2RPQs to the satisfiability problem for tamed ZOIQ, yielding a uniform 2ExpTime upper bound for all the considered logics. As a preliminary step towards lifting the above results to the realm of data complexity, we establish that the satisfiability of tamed ZOIQ is NP-complete. 2. Under the finite model semantics, we focus on UCQs only. With the proviso that the finite satisfiability problem for ZIQ is ExpTime-complete, we also provide a complete picture of the complexity of the query entailment problems. The key insight is that ZOI and ZOQ are finitely controllable.
3. We conclude the thesis by investigating the decidability border of further extensions of the Z family of DLs. Our goal is to understand whether the class of regular languages present in path expressions in Z is maximal for guaranteeing decidability of the underlying logic. We provide a series of undecidability results involving simple, non-regular languages (a subclass of visibly pushdown languages).
Our proofs rely on well-established model- and graph-theoretic definitions. What is more, most of them generalise (in a uniform way) and solidify multiple results known by the DL community. Our proofs are also easily adjustable to freshly defined logics, without the need to reproduce nearly-identical proofs.
|
2 |
Sample synopses for approximate answering of group-by queriesLehner, Wolfgang, Rösch, Philipp 22 April 2022 (has links)
With the amount of data in current data warehouse databases growing steadily, random sampling is continuously gaining in importance. In particular, interactive analyses of large datasets can greatly benefit from the significantly shorter response times of approximate query processing. Typically, those analytical queries partition the data into groups and aggregate the values within the groups. Further, with the commonly used roll-up and drill-down operations a broad range of group-by queries is posed to the system, which makes the construction of highly-specialized synopses difficult.
In this paper, we propose a general-purpose sampling scheme that is biased in order to answer group-by queries with high accuracy. While existing techniques focus on the size of the group when computing its sample size, our technique is based on its standard deviation. The basic idea is that the more homogeneous a group is, the less representatives are required in order to give a good estimate. With an extensive set of experiments, we show that our approach reduces both the estimation error and the construction cost compared to existing techniques.
|
3 |
Conjunctive Queries with Inequalities Under UpdatesIdris, Muhammad, Ugarte, Martín, Vansummeren, Stijn, Voigt, Hannes, Lehner, Wolfgang 15 June 2022 (has links)
Modern application domains such as Composite Event Recognition (CER) and real-time Analytics require the ability to dynamically refresh query results under high update rates. Traditional approaches to this problem are based either on the materialization of subresults (to avoid their recomputation) or on the recomputation of subresults (to avoid the space overhead of materialization). Both techniques have recently been shown suboptimal: instead of materializing results and subresults, one can maintain a data structure that supports efficient maintenance under updates and can quickly enumerate the full query output, as well as the changes produced under single updates. Unfortunately, these data structures have been developed only for aggregate-join queries composed of equi-joins, limiting their applicability in domains such as CER where temporal joins are commonplace. In this paper, we present a new approach for dynamically evaluating queries with multi-way θ-joins under updates that is effective in avoiding both materialization and recomputation of results, while supporting a wide range of applications. To do this we generalize Dynamic Yannakakis, an algorithm for dynamically processing acyclic equi-join queries. In tandem, and of independent interest, we generalize the notions of acyclicity and free-connexity to arbitrary θ-joins. We instantiate our framework to the case where θ-joins are only composed of equalities and inequalities (<, ≤, =, >, ≥) and experimentally compare this algorithm, called IEDyn, to state of the art CER systems as well as incremental view maintenance engines. IEDyn performs consistently better than the competitor systems with up to two orders of magnitude improvements in both time and memory consumption.
|
4 |
Model-based Integration of Past & Future in TimeTravelKhalefa, Mohamed E., Fischer, Ulrike, Pedersen, Torben Bach, Lehner, Wolfgang 10 January 2023 (has links)
We demonstrate TimeTravel, an efficient DBMS system for seamless integrated querying of past and (forecasted) future values of time series, allowing the user to view past and future values as one joint time series. This functionality is important for advanced application domain like energy. The main idea is to compactly represent time series as models. By using models, the TimeTravel system answers queries approximately on past and future data with error guarantees (absolute error and confidence) one order of magnitude faster than when accessing the time series directly. In addition, it efficiently supports exact historical queries by only accessing relevant portions of the time series. This is unlike existing approaches, which access the entire time series to exactly answer the query.
To realize this system, we propose a novel hierarchical model index structure. As real-world time series usually exhibits seasonal behavior, models in this index incorporate seasonality. To construct a hierarchical model index, the user specifies seasonality period, error guarantees levels, and a statistical forecast method. As time proceeds, the system incrementally updates the index and utilizes it to answer approximate and exact queries. TimeTravel is implemented into PostgreSQL, thus achieving complete user transparency at the query level. In the demo, we show the easy building of a hierarchical model index for a real-world time series and the effect of varying the error guarantees on the speed up of approximate and exact queries.
|
Page generated in 0.0553 seconds