

Query processing in heterogeneous distributed database management systems

The goal of this work is to present an advanced query processing algorithm formulated and developed in support of heterogeneous distributed database management systems. Heterogeneous distributed database management systems view the integrated data through a uniform global schema. The query processing algorithm described here produces an inexpensive strategy for a query expressed over the global schema. The research addresses the following aspects of query processing: (1) formulation of a low-level query language to express the fundamental heterogeneous database operations; (2) translation of the query expressed over the global schema to an equivalent query expressed over a conceptual schema; (3) an estimation methodology to derive the intermediate result sizes of the database operations; (4) a query decomposition algorithm to generate an efficient sequence of the basic database operations to answer the query. This research addressed the first issue by developing an algebraic query language called cluster algebra. The cluster algebra consists of the following operations: (a) selection, union, intersection and difference, which are extensions of their relational algebraic counterparts to heterogeneous databases; (b) normal-join and normal-projection, which replace their counterparts, join and projection, in the relational algebra; (c) two new operators, embed and unembed, which restructure the database schema. The second issue, query translation, was addressed by developing an algorithm that translates a cluster algebra query expressed over the virtual views to an equivalent cluster algebra query expressed over the conceptual databases. To address the third issue, a non-parametric methodology was developed to estimate the result size of a cluster algebra operation.
Finally, this research developed a query decomposition algorithm, applicable to both relational and non-relational databases, that decomposes a query by computing all profitable semi-join operations and then determining the best sequence of join operations per processing site. The join optimization is performed by formulating a zero-one integer linear program that uses the non-parametric estimation technique to compute the sizes of intermediate results. The query processing algorithm was implemented in the context of DAVID, a heterogeneous distributed database management system. / Ph. D.
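
The abstract does not give the cost model used to decide which semi-joins are "profitable", so the following is only a minimal Python sketch of the general textbook idea, not the dissertation's algorithm. It assumes a semi-join is profitable when the rows it removes from the target relation would cost more to ship than the join-column values sent to perform it. The names `semijoin`, `net_benefit`, `profitable_semijoins`, and the cost parameters `row_cost` and `value_cost` are hypothetical. The sketch also counts rows exactly on toy data, whereas the work described above relies on estimated intermediate result sizes.

```python
from itertools import permutations


def semijoin(target_rows, reducer_rows, attr):
    """Keep only the target rows whose `attr` value also appears in the reducer."""
    reducer_values = {row[attr] for row in reducer_rows}
    return [row for row in target_rows if row[attr] in reducer_values]


def net_benefit(target_rows, reducer_rows, attr, row_cost=1.0, value_cost=0.25):
    """Assumed cost model (illustrative only).

    cost    = shipping the distinct join values of the reducer to the target's site
    benefit = target rows eliminated, which no longer need to be shipped
    """
    distinct_values = {row[attr] for row in reducer_rows}
    cost = value_cost * len(distinct_values)
    eliminated = len(target_rows) - len(semijoin(target_rows, reducer_rows, attr))
    benefit = row_cost * eliminated
    return benefit - cost


def profitable_semijoins(relations, attr):
    """Return (target, reducer, net) for every ordered pair with positive net benefit."""
    result = []
    for target, reducer in permutations(relations, 2):
        net = net_benefit(relations[target], relations[reducer], attr)
        if net > 0:
            result.append((target, reducer, net))
    return result


# Toy data: two relations that share the join attribute "cust".
relations = {
    "orders": [
        {"cust": 1, "item": "a"},
        {"cust": 2, "item": "b"},
        {"cust": 3, "item": "c"},
        {"cust": 4, "item": "d"},
    ],
    "customers": [
        {"cust": 1, "city": "X"},
        {"cust": 2, "city": "Y"},
    ],
}

candidates = profitable_semijoins(relations, "cust")
print(candidates)
```

In this toy case, reducing `orders` by `customers` removes two rows at the price of shipping two join values, so it is kept; the reverse direction removes nothing and is rejected. A fuller optimizer would then choose a join order over the reduced relations, which the abstract says is formulated as a zero-one integer linear program.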

LD5655.V856 1992.B424

Database management

Distributed databases

Identifier	oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/39437
Date	20 September 2005
Creators	Bhasker, Bharat
Contributors	Computer Science, Egyhazy, Csaba J., Haddad, Emile K., Triantis, Konstantinos P., Hartson, H. Rex, Ricci, Fred J.
Publisher	Virginia Tech
Source Sets	Virginia Tech Theses and Dissertation
Language	English
Detected Language	English
Type	Dissertation, Text
Format	vi, 224 leaves, BTD, application/pdf
Rights	In Copyright, http://rightsstatements.org/vocab/InC/1.0/
Relation	OCLC# 26554378, LD5655.V856_1992.B424.pdf
