The rise of XML as a de facto standard for document and data exchange has created a need to store and query XML documents in relational databases, today's de facto standard for data storage. Two common strategies for storing XML documents in relational databases, a process known as document shredding, are Interval encoding and ORDPATH encoding. Interval encoding, which uses a fixed mapping for shredding XML documents, tends to favor selection queries, at a potential cost of O(N) per insertion. ORDPATH encoding, which uses a looser mapping for shredding XML, supports fixed-cost insertions, at a potential cost of longer-running selection queries. Experiments conducted for this research suggest that the breakeven point between the two algorithms occurs when users issue an average of 1 insertion for every 5.6 queries, for documents between 1.5 MB and 4 MB in size. However, heterogeneous tests of varying mixes of selects and inserts indicate that Interval always outperforms ORDPATH for mixes ranging from 76% selects to 88% selects. Queries and sample documents for these experiments were drawn from the XMark benchmark suite.
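To illustrate the trade-off described above, here is a minimal Python sketch of Interval encoding. It is not the implementation used in the thesis: the row layout (`tag`, `start`, `end`), the function names, and the sample document are all assumptions made for illustration. Each element receives a `start` and `end` number from a depth-first traversal, so a descendant test is a simple range comparison, while inserting a node forces renumbering of every boundary at or after the insertion point.

```python
import xml.etree.ElementTree as ET


def shred_interval(xml_text):
    """Shred a document into rows numbered by a depth-first traversal.

    Hypothetical row layout: {"tag", "start", "end"}.
    """
    rows = []
    counter = 0

    def visit(elem):
        nonlocal counter
        counter += 1
        start = counter
        for child in elem:
            visit(child)
        counter += 1
        rows.append({"tag": elem.tag, "start": start, "end": counter})

    visit(ET.fromstring(xml_text))
    rows.sort(key=lambda r: r["start"])
    return rows


def descendants(rows, ancestor):
    """Selection: a node is a descendant if its interval nests inside the ancestor's."""
    return [
        r for r in rows
        if ancestor["start"] < r["start"] and r["end"] < ancestor["end"]
    ]


def insert_last_child(rows, parent, tag):
    """Insert a new leaf as the last child of `parent`.

    Every boundary at or after the parent's old end shifts by 2, which is
    the source of the O(N) worst case. Returns the number of rows renumbered.
    """
    pos = parent["end"]  # captured before any renumbering
    touched = 0
    for r in rows:
        changed = False
        if r["start"] >= pos:
            r["start"] += 2
            changed = True
        if r["end"] >= pos:
            r["end"] += 2
            changed = True
        if changed:
            touched += 1
    rows.append({"tag": tag, "start": pos, "end": pos + 1})
    rows.sort(key=lambda r: r["start"])
    return touched


# Illustrative document loosely modeled on XMark element names (an assumption).
doc = "<site><people><person/></people><items><item/><item/></items></site>"
rows = shred_interval(doc)
people = next(r for r in rows if r["tag"] == "people")
renumbered = insert_last_child(rows, people, "person")
```

In this small example the single insertion renumbers five of the six existing rows, whereas the descendant query remains a pure range comparison. ORDPATH-style labels avoid this renumbering by leaving gaps in the label space, which is the fixed-cost insertion behavior the abstract refers to.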
Identifier | oai:union.ndltd.org:ETSU/oai:dc.etsu.edu:etd-3577 |
Date | 06 May 2006 |
Creators | Leonard, Jonathan Lee |
Publisher | Digital Commons @ East Tennessee State University |
Source Sets | East Tennessee State University |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | Electronic Theses and Dissertations |
Rights | Copyright by the authors. |