Global ETD Search

1	Conception d'un framework pour la relaxation des requêtes SPARQL / Design of a Framework for Cooperative Answering of SPARQL Query in RDF Database Fokou Pelap, Géraud 21 November 2016 (has links) An ontology (or knowledge base) is a formal representation of knowledge as entities and facts about those entities. In recent years, many ontologies have been developed in academic and industrial contexts. They are generally defined in the RDF formal language and queried with the SPARQL query language. Partial knowledge of an ontology's content and structure can lead users to run queries that return an empty result, which is considered unsatisfactory. Among the cooperative querying techniques developed to address the empty-answer problem, query relaxation is the best known and most widely used. It weakens the conditions expressed in the original query in order to return alternative answers to the user. 
Existing work on the relaxation of SPARQL queries suffers from several limitations: (1) it does not let users define precisely which relaxation to perform while still controlling the relaxation process; (2) it does not identify the actual causes of failure of the user's query; and (3) it does not include interactive tools for making better use of the proposed relaxation techniques. To address these limitations, this thesis proposes a framework for relaxing SPARQL queries. First, the framework includes a set of relaxation operators dedicated to SPARQL queries, which incrementally relax specific parts of the user query while controlling the relevance of the alternative answers with respect to the needs expressed in that query. Second, it provides several algorithms that identify both the causes of failure of the user query and the successful queries (those that return results) that retain a maximal number of the conditions of the initial query. This information helps users understand why their query fails and lets them run queries that return non-empty alternative results. Finally, the framework offers relaxation strategies that widen the conditions of the user query based on its causes of failure. These strategies reduce the execution time of the relaxation process compared with the traditional approach, which executes relaxed queries in order of their similarity to the user query until a satisfactory number of alternative results is obtained. The contributions were implemented and validated through scenarios and experiments based on the LUBM benchmark. The results show the benefit of these contributions over the state of the art. Empty answer problem, Top-K answers, Failing causes
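The idea of identifying failure causes alongside the largest succeeding subqueries can be illustrated with a small sketch. It is not the thesis's algorithm: it assumes that a "cause of failure" can be read as a minimal failing subquery and that "maximal" means maximal by set inclusion, and it enumerates all subsets of triple patterns by brute force over a toy in-memory triple set. The data, the function names, and the `?`-prefixed variable convention are all hypothetical.

```python
from itertools import combinations

def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; return None on conflict."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):            # variable: bind it or check the existing binding
            if new.setdefault(p, t) != t:
                return None
        elif p != t:                     # constant: must be equal
            return None
    return new

def evaluate(patterns, triples):
    """Naive nested-loop join: all bindings that satisfy every triple pattern."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for b in bindings:
            for t in triples:
                b2 = match(pattern, t, b)
                if b2 is not None:
                    extended.append(b2)
        bindings = extended
    return bindings

def failure_analysis(patterns, triples):
    """Return (minimal failing subqueries, maximal succeeding subqueries) as index sets."""
    failing, succeeding = [], []
    n = len(patterns)
    for size in range(1, n + 1):
        for idx in combinations(range(n), size):
            sub = [patterns[i] for i in idx]
            target = succeeding if evaluate(sub, triples) else failing
            target.append(frozenset(idx))
    mfs = [f for f in failing if not any(g < f for g in failing)]
    xss = [s for s in succeeding if not any(s < g for g in succeeding)]
    return mfs, xss

# Hypothetical toy data: nobody is both a Professor and enrolled in a course.
triples = {
    ("alice", "type", "Student"), ("alice", "takes", "db101"),
    ("bob", "type", "Professor"), ("bob", "teaches", "db101"),
}
query = [("?x", "type", "Professor"), ("?x", "takes", "?c"), ("?y", "teaches", "?c")]
mfs, xss = failure_analysis(query, triples)
```

Exhaustive enumeration is exponential in the number of triple patterns, so this only serves to make the two notions concrete on a tiny query.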
2	Why-Query Support in Graph Databases Vasilyeva, Elena 28 March 2017 (has links) (PDF) Over the last few decades, database management systems have become powerful tools for storing large amounts of data and executing complex queries over them. Alongside this extended functionality, novel types of databases have appeared, such as triple stores and distributed databases. Graph databases implementing the property-graph model belong to this line of development and provide a new way of storing and processing data as a graph, with nodes representing entities and edges describing the connections between them. This makes them suitable for keeping data without a rigid schema in use cases such as social-network processing or data integration. In addition to flexible storage, graph databases provide new querying possibilities such as path queries, detection of connected components, and pattern matching. However, schema flexibility and graph queries come at a cost. With limited knowledge of the data and little experience in constructing complex queries, users can write queries that deliver unexpected results. Forced to debug queries manually and overwhelmed by the number of query constraints, users can become frustrated with graph databases. What is needed is to improve the usability of graph databases by providing debugging and explanation functionality for such situations. Users need help discovering the reasons for unexpected results and what can be done to fix them. The unexpectedness of a result set can be expressed in terms of its size or its content. In the first case, users face the empty-answer, too-many-answers, or too-few-answers problem. In the second case, users care about the result content and either miss some expected answers or wonder about the presence of unexpected ones. 
Considering the typical problems of receiving no or too many results when querying graph databases, this thesis focuses on the problems of the first group, whose solutions are usually represented by why-empty, why-so-few, and why-so-many queries. Our objective is to extend graph databases with debugging functionality in the form of why-queries for unexpected query results, using pattern-matching queries, one of the general graph-query types, as the example. We present a comprehensive analysis of existing debugging tools in state-of-the-art research and identify their common properties. From these, we formulate the features of why-queries discussed in this thesis: holistic support of different cardinality-based problems, explanation of unexpected results and query reformulation, comprehensive analysis of explanations, and non-intrusive user integration. To support different cardinality-based problems, we develop methods for explaining no, too few, and too many results. To cover different kinds of explanations, we present two types: subgraph-based and modification-based explanations. The first type identifies the reasons for unexpectedness in terms of query subgraphs and delivers differential graphs as answers. The second type reformulates queries so that they produce better results. Treating graph queries as complex structures with multiple constraints, we investigate different ways of generating explanations, from the most general one, which considers only the query topology, through coarse-grained rewriting, up to fine-grained modification, which allows fine changes to predicates and topology. To provide a comprehensive analysis of explanations, we propose comparing them on three levels: their syntactic description, their content, and the size of their result set. To deliver user-aware explanations, we discuss two models for non-intrusive user integration in the generation process. 
With the techniques proposed in this thesis, we provide the fundamentals for debugging pattern-matching queries that deliver no, too few, or too many results in graph databases implementing the property-graph model. Graph databases, query processing, pattern matching, empty-answer problem, why-queries, ddc:004, rvk:ST 265, rvk:ST 270
3	Why-Query Support in Graph Databases Vasilyeva, Elena 08 November 2016 (has links) info:eu-repo/classification/ddc/004 ddc:004 Graph databases, query processing
4	Relaxation of Subgraph Queries Delivering Empty Results Vasilyeva, Elena, Thiele, Maik, Mocan, Adrian, Lehner, Wolfgang 16 September 2022 (has links) Graph databases with the property graph model are used in multiple domains, including social networks, biology, and data integration. They provide schema-flexible storage for data with varying degrees of structure and support complex, expressive queries such as subgraph isomorphism queries. The flexibility and expressiveness of graph databases make it difficult for users to express queries correctly and can lead to unexpected query results, e.g., empty results. Therefore, we propose a relaxation approach for subgraph isomorphism queries that automatically rewrites a graph query so that the rewritten query is similar to the original and returns a non-empty result set. In detail, we present relaxation operations applicable to a query, cardinality estimation heuristics, and strategies for prioritizing the graph query elements to be relaxed. To determine the similarity between the original query and its relaxed variants, we propose a novel cardinality-based graph edit distance. The feasibility of our approach is shown using real-world queries from the DBpedia query log. info:eu-repo/classification/ddc/004 ddc:004
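As a rough illustration of rewriting a subgraph query until it returns a non-empty result, the sketch below drops as few query edges as possible. It is a simplification under stated assumptions, not the paper's method: edge removal is the only relaxation operation, similarity is the plain count of dropped edges rather than the cardinality-based graph edit distance the abstract mentions, and matching is brute force. The graph, the query, and the function names are hypothetical.

```python
from itertools import combinations, permutations

def matches(q_nodes, q_edges, g_nodes, g_edges):
    """Brute-force subgraph isomorphism: injective vertex mappings that
    respect node labels and every (source, type, target) query edge."""
    q_vars = list(q_nodes)
    results = []
    for image in permutations(g_nodes, len(q_vars)):
        m = dict(zip(q_vars, image))
        labels_ok = all(g_nodes[m[v]] == q_nodes[v] for v in q_vars)
        edges_ok = all((m[s], t, m[d]) in g_edges for s, t, d in q_edges)
        if labels_ok and edges_ok:
            results.append(m)
    return results

def relax(q_nodes, q_edges, g_nodes, g_edges):
    """Try relaxed queries in order of the number of dropped edges and
    return (kept edges, dropped edges, matches) for the first non-empty one."""
    for k in range(len(q_edges) + 1):
        for dropped in combinations(q_edges, k):
            kept = [e for e in q_edges if e not in dropped]
            found = matches(q_nodes, kept, g_nodes, g_edges)
            if found:
                return kept, list(dropped), found
    return None  # no node-label assignment exists even without edges

# Hypothetical data graph: vertex id -> label, plus typed edges.
g_nodes = {1: "Person", 2: "Person", 3: "City"}
g_edges = {(1, "knows", 2), (1, "livesIn", 3)}

# Hypothetical query: two acquainted persons living in the same city.
q_nodes = {"a": "Person", "b": "Person", "c": "City"}
q_edges = [("a", "knows", "b"), ("a", "livesIn", "c"), ("b", "livesIn", "c")]

kept, dropped, found = relax(q_nodes, q_edges, g_nodes, g_edges)
```

Trying relaxations in order of increasing edit count mirrors the goal of staying close to the original query. A practical system would avoid evaluating every candidate, which is where cardinality estimates and prioritization strategies come in.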
