Global ETD Search

1	Distributed XML Query Processing. Kling, Patrick. January 2012.
While centralized query processing over collections of XML data stored at a single site is a well understood problem, centralized query evaluation techniques are inherently limited in their scalability when presented with large collections (or a single, large document) and heavy query workloads. In the context of relational query processing, similar scalability challenges have been overcome by partitioning data collections, distributing them across the sites of a distributed system, and then evaluating queries in a distributed fashion, usually in a way that ensures locality between (sub-)queries and their relevant data. This thesis presents a suite of query evaluation techniques for XML data that follow a similar approach to address the scalability problems encountered by XML query evaluation. Due to the significant differences in data and query models between relational and XML query processing, it is not possible to directly apply distributed query evaluation techniques designed for relational data to the XML scenario. Instead, new distributed query evaluation techniques need to be developed. Thus, in this thesis, an end-to-end solution to the scalability problems encountered by XML query processing is proposed. Based on a data partitioning model that supports both horizontal and vertical fragmentation steps (or any combination of the two), XML collections are fragmented and distributed across the sites of a distributed system. Then, a suite of distributed query evaluation strategies is proposed. These query evaluation techniques ensure locality between each fragment of the collection and the parts of the query corresponding to the data in this fragment. Special attention is paid to scalability and query performance, which is achieved by ensuring a high degree of parallelism during distributed query evaluation and by avoiding access to irrelevant portions of the data.
For maximum flexibility, the suite of distributed query evaluation techniques proposed in this thesis provides several alternative approaches for evaluating a given query over a given distributed collection. Thus, to achieve the best performance, it is necessary to predict and compare the expected performance of each of these alternatives. In this work, this is accomplished through a query optimization technique based on a distribution-aware cost model. The same cost model is also used to fine-tune the way a collection is fragmented to the demands of the query workload evaluated over this collection. To evaluate the performance impact of the distributed query evaluation techniques proposed in this thesis, the techniques were implemented within a production-quality XML database system. Based on this implementation, a thorough experimental evaluation was performed. The results of this evaluation confirm that the distributed query evaluation techniques introduced here lead to significant improvements in query performance and scalability both when compared to centralized techniques and when compared to existing distributed query evaluation techniques.
distributed query processing; XML query processing; Computer Science
3	Holistic Boolean Twig Pattern Matching for Efficient XML Query Processing. Ding, Dabin. 01 May 2014.
Efficient twig pattern matching is essential to XML queries and other tree-based queries. Numerous so-called holistic algorithms have been proposed for efficiently processing the twig patterns in XML queries. However, a more general form of twig pattern, called Boolean-twig (or B-twig for short), which allows arbitrary combinations of any number of the three logical connectives AND, OR, and NOT in a twig pattern, has not been adequately addressed. This study focuses on holistic (and efficient) B-twig pattern matching using region encoding and Dewey encoding schemes. We first adopt region encoding and propose a novel, direct approach called DBTwigMerge for holistic B-twig pattern matching, which, although it enjoys a certain theoretical "beauty" and "elegance", does not always outperform our prior approach, BTwigMerge. Based on the experience gained and an in-depth investigation, we then develop a new and more efficient approach, FBTwigMerge, which is proven to be the overall winner among all the holistic approaches using region encoding. We also study the holistic B-twig pattern matching problem using Dewey encoding. The unique properties of Dewey encoding bring both challenges and benefits to this problem. By carefully addressing the challenges, this dissertation presents the first Dewey-based holistic approach, called DeweyNOT, for efficiently solving the pattern matching problem for a subclass of B-twigs, i.e., twig queries involving arbitrary AND/NOT predicates. Extensive experimental studies demonstrate the viability and outstanding performance of the proposed approaches.
B-twig; holistic pattern matching; twig query; XML query processing
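The two node-labeling schemes named in the abstract above, region encoding and Dewey encoding, can be illustrated with a minimal sketch. It shows only the basic structural tests (ancestor and parent) that such labels make possible without traversing the tree; it is not the dissertation's DBTwigMerge, FBTwigMerge, or DeweyNOT algorithm. The names `Region`, `region_is_ancestor`, `dewey_is_ancestor`, and the sample labels are assumptions made for illustration.

```python
from typing import NamedTuple, Tuple


class Region(NamedTuple):
    """Region label (hypothetical layout): start/end positions and depth."""
    start: int
    end: int
    level: int


def region_is_ancestor(a: Region, d: Region) -> bool:
    # a is an ancestor of d when a's interval strictly contains d's interval.
    return a.start < d.start and d.end < a.end


def region_is_parent(a: Region, d: Region) -> bool:
    # A parent is an ancestor exactly one level above.
    return region_is_ancestor(a, d) and a.level + 1 == d.level


# A Dewey label is the path of child positions from the root, e.g. (1, 2, 1).
Dewey = Tuple[int, ...]


def dewey_is_ancestor(a: Dewey, d: Dewey) -> bool:
    # a is an ancestor of d when a is a proper prefix of d.
    return len(a) < len(d) and d[:len(a)] == a


def dewey_is_parent(a: Dewey, d: Dewey) -> bool:
    # A parent's label is the child's label with the last component removed.
    return len(d) > 0 and d[:-1] == a


# Sample labels (assumed) for a tiny document: book > chapter > section.
book_r, chapter_r, section_r = Region(1, 10, 1), Region(2, 9, 2), Region(3, 4, 3)
book_d, chapter_d, section_d = (1,), (1, 1), (1, 1, 1)
```

With region labels each test is a pair of integer comparisons, while a Dewey label also carries the labels of all of its ancestors as prefixes, which is one of the properties the abstract alludes to when it mentions the benefits of Dewey encoding.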
4	BINDING HASH TECHNIQUE FOR XML QUERY OPTIMIZATION. BRANT, MICHAEL J. 20 July 2006.
No description available.
XML Query Processing; XML Query Optimization; Semi-structured data; XPath
