1 |
Distributed XML Query Processing
Kling, Patrick, January 2012
While centralized query processing over collections of XML data stored at a single site is a well-understood problem,
centralized query evaluation techniques are inherently limited in their scalability when presented
with large collections (or a single, large document) and heavy query workloads.
In the context of relational query processing,
similar scalability challenges have been overcome by partitioning data collections,
distributing them across the sites of a distributed system, and then
evaluating queries in a distributed fashion, usually in a way that ensures locality between
(sub-)queries and their relevant data.
This thesis presents a suite of query evaluation techniques for XML data that follow a similar
approach to address the scalability problems encountered by XML query evaluation.
Due to the significant differences in data and query models between relational and XML query
processing, it is not possible to directly apply distributed query evaluation techniques designed
for relational data to the XML scenario.
Instead, new distributed query evaluation
techniques need to be developed.
Thus, in this thesis, an end-to-end solution to the scalability problems encountered by XML query
processing is proposed.
Based on a data partitioning model that supports both horizontal and vertical
fragmentation steps (or any combination of the two), XML collections are fragmented and distributed
across the sites of a distributed system.
Then, a suite of distributed query evaluation strategies is
proposed. These query evaluation techniques ensure locality between each fragment of the collection and
the parts of the query corresponding to the data in this fragment. Special attention is paid to
scalability and query performance, which are achieved by ensuring a high degree of parallelism
during distributed query evaluation and by avoiding access to irrelevant portions of the data.
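
As a rough illustration of the locality and pruning idea described above (not the thesis's actual algorithms), the following Python sketch horizontally fragments a toy collection by a predicate and evaluates a query only on the fragments that can contain matches. The collection, the `year` predicate, the fragment names, and all function names are hypothetical; vertical fragmentation is not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy collection: one small document per book.
DOCS = [
    "<book><year>1999</year><title>A</title></book>",
    "<book><year>2005</year><title>B</title></book>",
    "<book><year>2011</year><title>C</title></book>",
]

def horizontal_fragments(docs, boundary=2000):
    """Split the collection into two fragments by a predicate on <year>."""
    fragments = {"old": [], "new": []}
    for text in docs:
        doc = ET.fromstring(text)
        key = "old" if int(doc.findtext("year")) < boundary else "new"
        fragments[key].append(doc)
    return fragments

def relevant_fragments(min_year, boundary=2000):
    """Skip fragments whose predicate cannot match year >= min_year."""
    return ["new"] if min_year >= boundary else ["old", "new"]

def titles_since(fragments, min_year):
    """Evaluate the query on relevant fragments only and merge the results."""
    result = []
    for name in relevant_fragments(min_year):
        # In a real system, each fragment would be processed at its own site.
        for doc in fragments[name]:
            if int(doc.findtext("year")) >= min_year:
                result.append(doc.findtext("title"))
    return sorted(result)
```

In this sketch, a query for books from 2001 onward touches only the `new` fragment, while a query with an earlier bound has to visit both.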
For maximum flexibility, the suite of distributed query evaluation techniques proposed in this thesis provides
several alternative approaches
for evaluating a given query over a given distributed collection. Thus, to achieve the best performance, it is
necessary to predict and compare the expected performance of each of these alternatives. In this
work, this is accomplished through a query optimization technique based on a
distribution-aware cost model. The same cost model is also used to fine-tune the way a collection is
fragmented to the demands of the query workload evaluated over this collection.
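
The abstract does not spell out the cost model, so the following Python sketch only illustrates the general idea of comparing alternative plans by estimated cost. The `Plan` fields, the cost formula (slowest site plus weighted data transfer), and the `transfer_weight` parameter are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Plan:
    name: str
    site_work: list   # hypothetical estimated local work per site
    shipped: float    # hypothetical amount of intermediate data shipped

def estimated_cost(plan, transfer_weight=2.0):
    """Assumed formula: sites run in parallel, so the slowest site
    dominates local cost; shipped data adds a weighted penalty."""
    return max(plan.site_work) + transfer_weight * plan.shipped

def choose_plan(plans):
    """Pick the alternative with the lowest estimated cost."""
    return min(plans, key=estimated_cost)
```

A plan that does more local work can still win under this formula if it ships far less intermediate data between sites.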
To evaluate the performance impact of the distributed query evaluation techniques proposed in this
thesis, the techniques were implemented within
a production-quality XML database system. Based on this implementation, a
thorough experimental evaluation was performed. The results of this evaluation confirm that the distributed query evaluation
techniques introduced here lead to significant improvements in query performance and scalability
both when compared to centralized techniques and when compared to existing distributed query
evaluation techniques.
|
3 |
Holistic Boolean Twig Pattern Matching for Efficient XML Query Processing
Ding, Dabin, 01 May 2014
Efficient twig pattern matching is essential to XML queries and other tree-based queries. Numerous so-called holistic algorithms have been proposed for efficiently processing the twig patterns in XML queries. However, a more general form of twig pattern, called Boolean-twig (or B-twig for short), which allows arbitrary combinations of any number of the three logical connectives AND, OR, and NOT in a twig pattern, has not been adequately addressed. This study focuses on holistic (and efficient) B-twig pattern matching using region encoding and Dewey encoding schemes. We first adopt region encoding and propose a novel, direct approach called DBTwigMerge for holistic B-twig pattern matching, which, although it enjoys a certain theoretical "beauty" and "elegance", does not always outperform our prior approach, BTwigMerge. Based on the experience gained and on in-depth investigation, we then develop another new and more efficient approach, FBTwigMerge, which is shown to be the overall winner among all the holistic approaches using region encoding. We also study the holistic B-twig pattern matching problem using Dewey encoding. The unique properties of Dewey encoding bring both challenges and benefits to this problem. By carefully addressing the challenges, this dissertation presents the first Dewey-based holistic approach, called DeweyNOT, for efficiently solving the pattern matching problem for a subclass of B-twigs, namely twig queries involving arbitrary AND/NOT predicates. Extensive experimental studies demonstrate the viability and strong performance of the proposed approaches.
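
For readers unfamiliar with the two labeling schemes named in the abstract, here is a minimal Python sketch of the structural checks they commonly support. It is not DBTwigMerge, FBTwigMerge, or DeweyNOT; the tuple layouts `(start, end, level)` for region labels and integer tuples for Dewey labels are assumptions based on the usual textbook formulation.

```python
def region_is_ancestor(a, d):
    """Region labels are assumed to be (start, end, level) tuples;
    a is an ancestor of d if a's interval strictly contains d's."""
    return a[0] < d[0] and d[1] < a[1]

def region_is_parent(a, d):
    """A parent is an ancestor exactly one level above."""
    return region_is_ancestor(a, d) and d[2] == a[2] + 1

def dewey_is_ancestor(a, d):
    """Dewey labels are assumed to be tuples such as (1, 2, 3);
    a is an ancestor of d if a is a proper prefix of d."""
    return len(a) < len(d) and d[:len(a)] == a
```

With region labels, the check compares interval endpoints; with Dewey labels, a node's label already contains the labels of all its ancestors, so the check reduces to a prefix comparison.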
|
4 |
BINDING HASH TECHNIQUE FOR XML QUERY OPTIMIZATION
BRANT, MICHAEL J., 20 July 2006
No description available.
|