Spelling suggestions: "subject:"query rewriting"" "subject:"query rewritings""
1 |
Tractable query answering for description logics via query rewritingPerez-Urbina, Hector M. January 2010 (has links)
We consider the problem of answering conjunctive queries over description logic knowledge bases via query rewriting. Given a conjunctive query Q and a TBox T, we compute a new query Q′ that incorporates the semantic consequences of T such that, for any ABox A, evaluating Q over T and A can be done by evaluating the new query Q′ over A alone. We present RQR—a novel resolution-based rewriting algorithm for the description logic ELHIO¬ that generalizes and extends existing approaches. RQR not only handles a spectrum of logics ranging from DL-Lite_core up to ELHIO¬, but it is worst-case optimal with respect to data complexity for all of these logics; moreover, given the form of the rewritten queries, their evaluation can be delegated to off-the-shelf (deductive) database systems. We use RQR to derive the novel complexity results that conjunctive query answering for ELHIO¬ and DL-Lite+ are, respectively, PTime and NLogSpace complete with respect to data complexity. In order to show the practicality of our approach, we present the results of an empirical evaluation. Our evaluation suggests that RQR, enhanced with various straightforward optimizations, can be successfully used in conjunction with a (deductive) database system in order to answer queries over knowledge bases in practice. Moreover, in spite of being a more general procedure, RQR will often produce significantly smaller rewritings than the standard query rewriting algorithm for the DL-Lite family of logics.
|
2 |
Mise en oeuvre de politiques de protection de données à caractère personnel : ine approche reposant sur la réécriture de requêtes SPARQLOulmakhzoune, Said 29 April 2013 (has links) (PDF)
With the constant proliferation of information systems around the globe, the need for decentralized and scalable data sharing mechanisms has become a major factor of integration in a wide range of applications. Literature on information integration across autonomous entities has tacitly assumed that the data of each party can be revealed and shared to other parties. A lot of research, concerning the management of heterogeneous sources and database integration, has been proposed, for example based on centralized or distributed mediators that control access to data managed by different parties. On the other hand, real life data sharing scenarios in many application domains like healthcare, e-commerce market, e-government show that data integration and sharing are often hampered by legitimate and widespread data privacy and security concerns. Thus, protecting the individual data may be a prerequisite for organizations to share their data in open environments such as Internet. Work undertaken in this thesis aims to ensure security and privacy requirements of software systems, which take the form of web services, using query rewriting principles. The user query (SPARQL query) is rewritten in such a way that only authorized data are returned with respect to some confidentiality and privacy preferences policy. Moreover, the rewriting algorithm is instrumented by an access control model (OrBAC) for confidentiality constraints and a privacy-aware model (PrivOrBAC) for privacy constraints. A secure and privacy-preserving execution model for data services is then defined. Our model exploits the services¿ semantics to allow service providers to enforce locally their privacy and security policies without changing the implementation of their data services i.e., data services are considered as black boxes. We integrate our model to the architecture of Axis 2.0 and evaluate its efficiency in the healthcare application domain.
|
3 |
Web Service Composition and Selection Using Query Rewriting and Bayesian NetworkHsieh, I-Hsuan 24 July 2012 (has links)
Web services can be broadly classified into two types, namely effect providing (EP) services and data providing (DP) services. In this work, we address DP service composition problem that intends to satisfy user preference specified at the instance level, namely the expected occurrence. We first use the query rewriting method to identify a composition of service types that satisfies user¡¦s requirement and employ Bayesian Network model to express the causal relationship between exchange variables of DP service types. Service selection is then conducted by computing the posterior probability in the Bayesian Network. We conduct experiments to show that our proposed Bayesian Network-based method outperforms the other baseline methods in terms of execution success rate and data quality. It also has reasonable execution time.
|
4 |
Indexing and querying dataspacesMergen, Sérgio Luis Sardi January 2011 (has links)
Over theWeb, distributed and heterogeneous sources with structured and related content form rich repositories of information commonly referred to as dataspaces. To provide access to this heterogeneous data, information integration systems have traditionally relied on the availability of a mediated schema, along with mappings between this schema and the schema of the source schemas. On dataspaces, where sources are plentiful, autonomous and extremely volatile, a system based on the existence of a pre-defined mediated schema and mapping information presents several drawbacks. Notably, the cost of keeping the mappings up to date as new sources are found or existing sources change can be prohibitively high. We propose a novel querying architecture that requires neither a mediated schema nor source mappings, which is based mainly on indexing mechanisms and on-the-fly rewriting algorithms. Our indexes are designed for data that is represented as relations, and are able to capture the structure of the sources, their instances and the connections between them. In the absence of a mediated schema, the user formulates structured queries based on what she expects to find. These queries are rewritten using a best-effort approach: the proposed rewriting algorithms compare a user query against the source schemas and produces a set of rewritings based on the matches found. Based on this architecture, two different querying approaches are tested. Experiments show that the indexing and rewriting algorithms are scalable, i.e., able to handle a very large number of structured Web sources; and that support simple, yet expressive queries that exploit the inherent structure of the data.
|
5 |
Indexing and querying dataspacesMergen, Sérgio Luis Sardi January 2011 (has links)
Over theWeb, distributed and heterogeneous sources with structured and related content form rich repositories of information commonly referred to as dataspaces. To provide access to this heterogeneous data, information integration systems have traditionally relied on the availability of a mediated schema, along with mappings between this schema and the schema of the source schemas. On dataspaces, where sources are plentiful, autonomous and extremely volatile, a system based on the existence of a pre-defined mediated schema and mapping information presents several drawbacks. Notably, the cost of keeping the mappings up to date as new sources are found or existing sources change can be prohibitively high. We propose a novel querying architecture that requires neither a mediated schema nor source mappings, which is based mainly on indexing mechanisms and on-the-fly rewriting algorithms. Our indexes are designed for data that is represented as relations, and are able to capture the structure of the sources, their instances and the connections between them. In the absence of a mediated schema, the user formulates structured queries based on what she expects to find. These queries are rewritten using a best-effort approach: the proposed rewriting algorithms compare a user query against the source schemas and produces a set of rewritings based on the matches found. Based on this architecture, two different querying approaches are tested. Experiments show that the indexing and rewriting algorithms are scalable, i.e., able to handle a very large number of structured Web sources; and that support simple, yet expressive queries that exploit the inherent structure of the data.
|
6 |
Indexing and querying dataspacesMergen, Sérgio Luis Sardi January 2011 (has links)
Over theWeb, distributed and heterogeneous sources with structured and related content form rich repositories of information commonly referred to as dataspaces. To provide access to this heterogeneous data, information integration systems have traditionally relied on the availability of a mediated schema, along with mappings between this schema and the schema of the source schemas. On dataspaces, where sources are plentiful, autonomous and extremely volatile, a system based on the existence of a pre-defined mediated schema and mapping information presents several drawbacks. Notably, the cost of keeping the mappings up to date as new sources are found or existing sources change can be prohibitively high. We propose a novel querying architecture that requires neither a mediated schema nor source mappings, which is based mainly on indexing mechanisms and on-the-fly rewriting algorithms. Our indexes are designed for data that is represented as relations, and are able to capture the structure of the sources, their instances and the connections between them. In the absence of a mediated schema, the user formulates structured queries based on what she expects to find. These queries are rewritten using a best-effort approach: the proposed rewriting algorithms compare a user query against the source schemas and produces a set of rewritings based on the matches found. Based on this architecture, two different querying approaches are tested. Experiments show that the indexing and rewriting algorithms are scalable, i.e., able to handle a very large number of structured Web sources; and that support simple, yet expressive queries that exploit the inherent structure of the data.
|
7 |
Réécriture de requêtes avec des vues : une perspective théorique et pratique / Query rewriting using views : a theoretical and practical perspectiveIleana, Ioana 24 October 2014 (has links)
Dans ce document, nous adressons le problème de la réécriture de requêtes avec des vues, en adoptant une perspective à la fois théorique et pratique. Dans le premier et principal chapitre, nous approchons le sujet de la recherche de toutes les reformulations minimales (sans atomes relationnels redondants) pour une requête relationnelle conjonctive, sous des contraintes d’intégrité qui incluent la relation entre les schémas source et cible. Nous présentons un nouvel algorithme, correct et complet, le Provenance-Aware Chase & Backchase, qui résout le problème des reformulations avec des performances significatives sur le plan pratique. Nous présentons sa caractérisation théorique détaillée, son implémentation optimisée et son évaluation, montrant des gains de performance jusqu’à deux ordres de grandeur par rapport à un SGBD commercial. Nous généralisons notre algorithme pour trouver directement des reformulations de coût minimum pour les fonctions de coût monotones, et montrons les gains de performance de cette adaptation. Avec notre algorithme, nous introduisons également un nouveau type de chase, la Provenance-Aware Chase, qui comporte son propre intérêt théorique, en tant que moyen de raisonnement sur l’interaction entre la provenance et les contraintes. Dans le deuxième chapitre, nous nous plaçons dans un contexte XML et nous revisitons le travail de Cautis, Deutsch and Onose sur problème de la réécriture de requêtes XPath par un seul niveau d’intersection de plusieurs vues. Nous étendons l’analyse de ce probleme en montrant ses connexions avec les problèmes de l’équivalence DAG-arbre et de la union-freeness d’un DAG. Nous raffinons un algorithme de réécriture proposé par Cautis, Deutsch and Onose pour obtenir une complexité polynomiale et améliorer sa complétude, et présentons un ensemble d’optimisations des procedures de réécriture, necessaires pour atteindre des performances pratiques. Nous fournissons une implementation complète comprenant ces optimizations ainsi que son evaluation experimentale extensive, montrant la performance et l’utilité de la technique polynomiale de réécriture. / In this work, we address the problem of query rewriting using views, by adopting both a theoretical and a pragmatic perspective. In the first and main chapter, we approach the topic of finding all minimal (i.e. with no redundant relational atoms) conjunctive query reformulations for a relational conjunctive query, under constraints expressed as embedded dependencies, including the relationship between the source and the target schemas. We present a novel sound and complete algorithm, the Provenance-Aware Chase & Backchase, that solves the minimal reformulations problem with practically relevant performance. We provide a detailed theoretical characterization of our algorithm. We further present the optimized implementation and the experimental evaluation thereof, and exhibit natural scenarios yielding speed-ups of up to two orders of magnitude between the execution of a best view-based rewriting found by a commercial DBMS and that of a best rewriting found by our algorithm. We generalize the Provenance-Aware Chase & Backchase towards directly finding minimum-cost reformulations for monotonic cost functions, and show the performance improvements this adaptation further enables. With our algorithm, we introduce a novel chase flavour, the Provenance-Aware Chase, which is interesting on its own, as a means of reasoning about the interaction between provenance and constraints. In the second chapter, we move to an XML context and revisit the previous work of Cautis, Deutsch and Onose on the problem of finding XPath query rewritings with a single level of intersection of multiple views. We enrich the analysis of the rewriting problem by showing its links to the problems of DAG-tree equivalence and union-freeness. We refine the rule-based rewriting technique proposed by Cautis, Deutsch and Onose to ensure its polynomial complexity and improve its completeness, and present a range of optimizations on the rewriting procedures, necessary to achieve practical performance. We provide a complete implementation comprising these optimizations and a thorough experimental evaluation thereof, showing the performanceand utility of the polynomial rewriting technique.
|
8 |
Semantic Caching for XML QueriesChen, Li 29 January 2004 (has links)
With the advent of XML, great challenges arise from the demand for efficiently retrieving information from remote XML sources across the Internet. The semantic caching technology can help to improve the efficiency of XML query processing in the Web environment. Different from the traditional tuple or page-based caching systems, semantic caching systems exploit the idea of reusing cached query results to answer new queries based on the query containment and rewriting techniques. Fundamental results on the containment of relational queries have been established. In the XML setting, the containment problem remains unexplored for comprehensive XML query languages such as XQuery, and little has been studied with respect to the cache management issue such as replacement. Hence, this dissertation addresses two issues fundamental to building an XQuery-based semantic caching system: XQuery containment and rewriting, and an effective replacement strategy. We first define a restricted XQuery fragment for which the containment problem is tackled. For two given queries $Q1$ and $Q2$, a preprocessing step including variable minimization and query normalization is taken to transform them into a normal form. Then two tree structures are constructed for respectively representing the pattern matching and result construction components of the query semantics. Based on the tree structures, query containment is reduced to tree homomorphism, with some specific mapping conditions. Important notations and theorems are also presented to support our XQuery containment and rewriting approaches. For the cache replacement, we propose a fine-grained replacement strategy based on the detailed user access statistics recorded on the internal XML view structure. As a result, less frequently used XML view fragments are replaced to achieve a better utilization of the cache space. Finally, we has implemented a semantic caching system called ACE-XQ to realize the proposed techniques. Case studies are conducted to confirm the correctness of our XQuery containment and rewriting approaches by comparing the query results produced by utilizing ACE-XQ against those by the remote XQuery engine. Experimental studies show that the query performance is significantly improved by adopting ACE-XQ, and that our partial replacement helps to enhance the cache hits and utilization comparing to the traditional total replacement.
|
9 |
L'interrogation du web de données garantissant des réponses valides par rapport à des critères donnés / Querying the Web of Data guaranteeing valid answers with respect to given criteriaNguyen, Thanh Binh 03 December 2018 (has links)
Le terme Linked Open Data (LOD) (ou données ouvertes liées) a été introduit pour la première fois par Tim Berners-Lee en 2006. Depuis, les LOD ont connu une importante évolution. Aujourd’hui,nous pouvons constater les milliers de jeux de données présents sur le Web de données. De ce fait, la communauté de recherche s’est confrontée à un certain nombre de défis concernant la récupération et le traitement de données liées.Dans cette thèse, nous nous intéressons au problème de la qualité des données extraites de diverses sources du LOD et nous proposons un système d’interrogation contextuelle qui garantit la qualité des réponses par rapport à un contexte spécifié par l’utilisateur. Nous définissons un cadre d’expression de contraintes et proposons deux approches : l’une naïve et l’autre de réécriture, permettant de filtrer dynamiquement les réponses valides obtenues à partir des sources éventuellement non-valides, ceci au moment de la requête et non pas en cherchant à les valider dans les sources des données. L’approche naïve exécute le processus de validation en générant et en évaluant des sous-requêtes pour chaque réponse candidate en fonction de chaque contrainte. Alors que l’approche de réécriture utilise les contraintes comme des règles de réécriture pour reformuler la requête en un ensemble de requêtes auxiliaires, de sorte que les réponses à ces requêtes réécrites ne sont pas seulement les réponses de la requête initiale mais aussi des réponses valides par rapport à toutes les contraintes intégrées. La preuve de la correction et de la complétude de notre système de réécriture est présentée après un travail de formalisation de la notion de réponse valide par rapport à un contexte. Ces deux approches ont été évaluées et ont montré la praticabilité de notre système.Ceci est notre principale contribution: nous étendons l’ensemble de systèmes de réécriture déjà connus(Chase, C&BC, PerfectRef, Xrewrite, etc.) avec une nouvelle solution efficace pour ce nouveau défi qu’est le filtrage des résultats en fonction d’un contexte utilisateur. Nous généralisons également les conditions de déclenchement de contraintes par rapport aux solutions existantes, en utilisant la notion de one-way MGU. / The term Linked Open Data (LOD) is proposed the first time by Tim Berners-Lee since 2006.Since then, LOD has evolved impressively with thousands datasets on the Web of Data, which has raised a number of challenges for the research community to retrieve and to process LOD.In this thesis, we focus on the problem of quality of retrieved data from various sources of the LOD and we propose a context-driven querying system that guarantees the quality of answers with respect to the quality context defined by users. We define a fragment of constraints and propose two approaches: the naive and the rewriting, which allows us to filter dynamically valid answers at the query time instead of validating them at the data source level. The naive approach performs the validation process by generating and evaluating sub-queries for each candidate answer w.r.t. each constraint. While the rewriting approach uses constraints as rewriting rules to reformulate query into a set of auxiliary queries such that the answers of rewritten-queries are not only the answers of the query but also valid answers w.r.t. all integrated constraints. The proof of the correction and completeness of our rewriting system is presented after formalizing the notion of a valid answers w.r.t. a context. These two approaches have been evaluated and have shown the feasibility of our system.This is our main contribution: we extend the set of well-known query-rewriting systems (Chase, Chase& backchase, PerfectRef, Xrewrite, etc.) with a new effective solution for the new purpose of filtering query results based on constraints in user context. Moreover, we also enlarge the trigger condition of the constraint compared with other works by using the notion of one-way MGU.
|
10 |
Inférence de règles de contrôle d'accès pour assurer la confidentialité des données au niveau des vues matérialisées / Access control rules for materialized views : an inference-based approachNait Bahloul, Sarah 05 December 2013 (has links)
Dans cette thèse, nous nous intéressons au problème de la confidentialité des données. Nous proposons une nouvelle approche pour faciliter l'administration des règles de contrôle d'accès pour assurer la confidentialité des données au niveau des vues matérialisées. Dans les bases de données relationnelles, une vue est une table virtuelle représentant le résultat d'une requête. À la différence d'une vue simple, une vue matérialisée stocke le résultat de la requête dans une table. Cette dernière peut être alors interrogée comme une table quelconque. Il est donc important d'y contrôler l'accès. Parmi les différents modèles proposés pour contrôler l'accès aux relations de base, nous nous basons dans notre approche sur l'utilisation des vues d'autorisations pour exprimer des règles de contrôle d'accès à grains fins. Nous proposons d'inférer, à partir des vues d'autorisations attachées aux tables de base, les vues d'autorisations qui doivent être attachées aux vues matérialisées. Répondre à ce problème revient à répondre à un problème fondamental dans les bases de données relationnelles : Comment caractériser les informations calculables à partir de deux ensembles de vues ? Nous répondons à cette question en nous appuyant sur la réécriture de requêtes. Nous adaptons l'algorithme de réécriture de requêtes M iniCon aux spécificités de notre problème et nous proposons l'algorithme 1-l M iniCon+ qui se base sur un enchainement de réécritures. Nous nous intéressons aux vues représentées par des requêtes conjonctives en autorisant les égalités. Nous nous sommes intéressés par la suite aux propriétés de cet algorithme. Nous démontrons que cet algorithme permet de calculer un ensemble de vues correctes, c.-à-d. toute information calculable à partir de l'ensemble de vues générées est cal culable à partir de chacun des deux ensembles de vues de départ / In this thesis, we address the problem of data confidentiality. We propose a new approach to facilitate the administration of access control policies to ensure confidentiality of data in materialized views. In relational databases, a view is a virtual table representing the result of a query. Unlike a simple view, a materialized view persistently stores the data in a table. The latter can be queried like any other database table. We then need to control the access to the materialized view. Among the various models proposed for controlling access to base relations, we choose to express fine-grained access control through authorization views. We propose to infer, from the basic authorization views attached to the base tables, authorization views that will be attached to the materialized views. Tackling this problem amounts to address a fundamental problem in relational databases : How to characterize computable information from two sets of views ? We handle this problem by resorting to query rewriting. We adapt the query rewriting algorithm MiniCon to the context of materialized views with access control and propose the H MiniCon+ algorithm which is based on successive rewritings. We mainly consider conjunctive queries with equalities. We study the properties of our approach. We show that our algorithm can calculate a correct set of views, i.e. any computable information from the generated views is calculable from the two sets of views. In order to prove the termination of our algorithm, we define rewriting trees generated by the application of 1-l MiniCon+ and we study their features. We characterize in which case a tree is finite and show that the approach is maximal, i.e., any derivable information from the two sets of views can be derived from the set of generated views. We characterize in which case the algorithm could not terminate i.e., infinite application of the query rewriting algorithm. In this case, it is impossible to determine the maximality of results and this remains an open problem. We implemented a prototype of the approach and we led some experiments by using synthetic data sets
|
Page generated in 0.0821 seconds