601

Query processing in heterogeneous distributed database management systems

Bhasker, Bharat 20 September 2005 (has links)
The goal of this work is to present an advanced query processing algorithm formulated and developed in support of heterogeneous distributed database management systems. Heterogeneous distributed database management systems view the integrated data through a uniform global schema. The query processing algorithm described here produces an inexpensive strategy for a query expressed over the global schema. The research addresses the following aspects of query processing: (1) Formulation of a low-level query language to express the fundamental heterogeneous database operations; (2) Translation of the query expressed over the global schema to an equivalent query expressed over a conceptual schema; (3) An estimation methodology to derive the intermediate result sizes of the database operations; (4) A query decomposition algorithm to generate an efficient sequence of the basic database operations to answer the query. This research addressed the first issue by developing an algebraic query language called cluster algebra. The cluster algebra consists of the following operations: (a) Selection, union, intersection, and difference, which are extensions of their relational algebraic counterparts to heterogeneous databases; (b) Normal-join and normal-projection, which replace their counterparts, join and projection, in the relational algebra; (c) Two new operators, embed and unembed, to restructure the database schema. The second issue, query translation, was addressed by developing an algorithm that translates a cluster algebra query expressed over the virtual views to an equivalent cluster algebra query expressed over the conceptual databases. A non-parametric estimation methodology to estimate the result size of a cluster algebra operation was developed to address the third issue. Finally, this research developed a query decomposition algorithm, applicable to relational and non-relational databases, that decomposes a query by computing all profitable semi-join operations, followed by determination of the best sequence of join operations per processing site. The join optimization is performed by formulating a zero-one integer linear program that uses the non-parametric estimation technique to compute the sizes of intermediate results. The query processing algorithm was implemented in the context of DAVID, a heterogeneous distributed database management system. / Ph. D.
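To make the decomposition strategy concrete, the Python sketch below illustrates the two phases described above: semi-joins are retained only when their estimated reduction outweighs their estimated cost, and a join order is then chosen by minimizing estimated intermediate-result sizes. The relation names, sizes, and selectivities are invented, and exhaustive enumeration stands in for the dissertation's zero-one integer linear program and non-parametric size estimates.

```python
from itertools import permutations

# Hypothetical catalog statistics: relation cardinalities and join selectivities.
SIZES = {"R": 10_000, "S": 4_000, "T": 500}
SELECTIVITY = {("R", "S"): 0.01, ("S", "T"): 0.05, ("R", "T"): 0.02}

def join_selectivity(a, b):
    """Look up the (symmetric) join selectivity between two relations."""
    return SELECTIVITY.get((a, b)) or SELECTIVITY.get((b, a)) or 1.0

def profitable_semijoins(relations):
    """Keep a semi-join r ⋉ s only when the estimated reduction of r exceeds
    the cost of shipping s (modelled crudely here as s's cardinality)."""
    chosen = []
    for r in relations:
        for s in relations:
            if r != s:
                reduction = SIZES[r] * (1.0 - join_selectivity(r, s))
                if reduction > SIZES[s]:
                    chosen.append((r, s))
    return chosen

def best_join_order(relations):
    """Pick the join order with the smallest estimated total intermediate-result
    size; a brute-force stand-in for the 0-1 integer linear program."""
    def cost(order):
        size, total, prev = SIZES[order[0]], 0.0, order[0]
        for nxt in order[1:]:
            size = size * SIZES[nxt] * join_selectivity(prev, nxt)
            total += size
            prev = nxt
        return total
    return min(permutations(relations), key=cost)

print(profitable_semijoins(["R", "S", "T"]))
print(best_join_order(["R", "S", "T"]))
```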
602

U/Pb Zircon Ages of Plutons from the Central Appalachians and GIS-Based Assessment of Plutons with Comments on Their Regional Tectonic Significance

Wilson, John Robert 08 October 2001 (has links)
The rocks of the Appalachian orogen are world-class examples of collisional and extensional tectonics, where multiple episodes of mountain building and rifting from the Precambrian to the present are preserved in the geologic record. These orogenic events produced plutonic rocks, which can be used as probes of the thermal state of the source region. SIMS (secondary ion mass spectrometry) U/Pb ages of zircons were obtained for ten plutons (Leatherwood, Rich Acres, Melrose, Buckingham, Diana Mills, Columbia, Poore Creek, Green Springs, Lahore and Ellisville) within Virginia. These plutons are chemically and isotopically distinct and show an age distribution in which felsic rocks are approximately 440 Ma and mafic rocks approximately 430 Ma. Initial strontium isotopic ratios and bulk geochemical analyses were also performed. These analyses show the bimodal nature of magmatism within this region. In order to facilitate management of geologic data, including radiometric ages, strontium isotope initial ratios, and major element geochemistry, a GIS-based approach has been developed. Geospatially referenced sample locations and associated attribute data allow for analysis of the data and an assessment of the accuracy of field locations of plutons at both regional and local scales. The GIS-based assessment of plutons also allows for the incorporation of other multidisciplinary databases to enhance analysis of regional and local geologic processes. Extending such coverage to the central Appalachians (distribution of lithotectonic belts, plutons, and their ages and compositions) will enable a rapid assessment of tectonic models. / Master of Science
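As a rough illustration of pairing geospatially referenced sample locations with their attribute data, the Python sketch below stores each sample as a record carrying its age and isotopic attributes and filters samples by an age window. The class, coordinates, and values are invented examples rather than the thesis data set, and no particular GIS package is assumed.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class PlutonSample:
    """One geospatially referenced sample with its attribute data."""
    pluton: str
    lon: float           # decimal degrees (WGS84 assumed)
    lat: float
    age_ma: float        # U/Pb zircon age in millions of years
    sr_initial: float    # initial 87Sr/86Sr ratio

# Invented example records; real locations and values come from the thesis data.
samples = [
    PlutonSample("Ellisville", -77.95, 38.01, 440.0, 0.7045),
    PlutonSample("Leatherwood", -79.85, 36.75, 430.0, 0.7060),
]

def plutons_in_age_window(records: List[PlutonSample], lo_ma: float, hi_ma: float):
    """Select samples whose U/Pb age falls within [lo_ma, hi_ma]."""
    return [s for s in records if lo_ma <= s.age_ma <= hi_ma]

print([s.pluton for s in plutons_in_age_window(samples, 435, 445)])  # ['Ellisville']
```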
603

PathMeld: A Methodology for The Unification of Metabolic Pathway Databases

Rajasimha, Harsha Karur 29 December 2004 (has links)
A biological pathway database is a database that describes biochemical pathways, reactions, enzymes that catalyze the reactions, and the substrates that participate in these reactions. A pathway genome database (PGDB) integrates pathway information with information about the complete genome of various sequenced organisms. Two of the popular PGDBs available today are the Kyoto Encyclopedia of Genes and Genomes (KEGG) and MetaCyc. The proliferation of biological databases in general raises several questions for the life scientist. Which of these databases is most accurate, most current, or most comprehensive? Do they have a standard format? Do they complement each other? Overall, which database should be used for what purpose? If more than one database is deemed relevant, it is desirable to have a unified database containing information from all the shortlisted databases. There is no standard methodology yet for integrating biological pathway databases and, to the best of our knowledge, no commercial software that can perform such integration tasks. While XML-based pathway data exchange standards such as BioPAX and SBML are emerging, they do not address basic problems such as inconsistent nomenclature and substrate matching between databases in the unification of pathway databases. Here, we present the PathMeld methodology to unify the KEGG and MetaCyc databases starting from their flat files. Individual PGDBs are transformed into a unified schema that we design. With individual PGDBs in the common unified schema, the key to the PathMeld methodology is to find the entity correspondences between the KEGG and MetaCyc substrates. We present a heuristic-driven approach for one-to-one mapping of the substrates between KEGG and MetaCyc. Using the exact name and chemical formula match criteria, 82.6% of the substrates in MetaCyc were matched accurately to corresponding substrates in KEGG. The substrate names in the MetaCyc database contain HTML tags and special characters such as <sub>, <sup>, <i>, <l>, &, and $. The MetaCyc chemical formulae are stored in Lisp format in the database, while KEGG stores them as continuous strings. Hence, we transform the MetaCyc chemical formulae into the KEGG format to make them directly comparable. Applying this pre-processing to MetaCyc substrate names and formulae improved substrate matching by 2%. To investigate how many of the remaining 17.4% of substrates are indeed absent from KEGG, we employ a standard UNIX-based approximate string matching tool called agrep. The resulting matches are curated into four mutually exclusive groups: 3.83% are correct matches, 3.17% are close matches, 7.45% are incorrect matches, and 3.68% of MetaCyc substrate names are not matched at all. This shows that 11.13% of MetaCyc substrate names are absent from KEGG. We note some of the implementation issues we solved. First, parsing only one flat file to populate one database table is not sufficient. Second, intermediate database tables are needed. Third, transformation of substrate names and chemical formulae from one of the component databases is required for comparison. Fourth, a biochemist's intervention is needed in evaluating the approximate substrate matches from agrep. In conclusion, the PathMeld methodology successfully unifies the KEGG and MetaCyc flat-file databases into a unified PostgreSQL database. Matching substrates between databases is the key issue in the unification process.
About 83% of the substrate correspondences can be achieved computationally, while the remaining 17% of substrates require approximate matching and manual curation by a biochemist. We presented several different techniques for substrate matching and showed that about 10% of the MetaCyc substrates do not match and hence are absent from KEGG. / Master of Science
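A minimal sketch of the kind of name cleaning, formula normalization, and exact matching described above is shown below, assuming tiny invented substrate tables and an assumed rendering of the Lisp-style formulae; the actual PathMeld pipeline works on the parsed flat files in PostgreSQL and follows exact matching with agrep-based approximate matching and manual curation.

```python
import re

# Tiny invented substrate tables (name -> formula); the real inputs are the
# KEGG and MetaCyc flat files parsed into the unified schema.
KEGG = {"D-glucose": "C6H12O6", "pyruvate": "C3H3O3"}
METACYC = {"D-<i>glucose</i>": "(C 6 H 12 O 6)", "&alpha;-ketoglutarate": "(C 5 H 4 O 5)"}

def clean_name(name):
    """Strip HTML tags and markup characters from a MetaCyc substrate name."""
    name = re.sub(r"<[^>]+>", "", name)            # remove <sub>, <sup>, <i>, ...
    return name.replace("&", "").replace("$", "").strip()

def lisp_to_kegg_formula(formula):
    """Convert an assumed Lisp-style formula like '(C 6 H 12 O 6)' to 'C6H12O6'."""
    return "".join(formula.strip("() ").split())

def match_substrates(metacyc, kegg):
    """One-to-one exact matching on cleaned name, falling back to formula."""
    matches = {}
    for raw_name, raw_formula in metacyc.items():
        name = clean_name(raw_name)
        formula = lisp_to_kegg_formula(raw_formula)
        if name in kegg:
            matches[raw_name] = name
        else:
            hits = [k for k, f in kegg.items() if f == formula]
            if len(hits) == 1:                     # unambiguous formula match
                matches[raw_name] = hits[0]
    return matches

print(match_substrates(METACYC, KEGG))             # {'D-<i>glucose</i>': 'D-glucose'}
```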
604

Optimizing Distributed Transactions: Speculative Client Execution, Certified Serializability, and High Performance Run-Time

Pandey, Utkarsh 01 September 2016 (has links)
On-line services already form an important part of modern life, with an immense potential for growth. Most of these services are supported by transactional systems, which in many cases are backed by database management systems (DBMS). Many on-line services use replication to ensure high availability, fault tolerance, and scalability. Replicated systems typically consist of different nodes running the service, coordinated by a distributed algorithm that aims to drive all the nodes along the same sequence of states by providing a total order over their operations. Thus, optimizing both local DBMS operations (through concurrency control) and the distributed algorithm driving replicated services can enhance the performance of on-line services. Deferred Update Replication (DUR) is a well-known approach to designing scalable replicated systems. In this method, the database is fully replicated on each distributed node. User threads perform transactions locally and optimistically before a total order is reached. DUR-based systems find their best usage when remote transactions rarely conflict. Even in such scenarios, transactions may abort due to local contention on nodes. A generally adopted method to alleviate the local contention is to invoke a local certification phase that checks whether a transaction conflicts with other local transactions already completed. If so, the given transaction is aborted locally without burdening the ordering layer. However, this approach still results in many local aborts, which significantly degrades performance. The first main contribution of this thesis is PXDUR, a DUR-based transactional system that enhances the performance of DUR-based systems by alleviating local contention and increasing the transaction commit rate. PXDUR alleviates local contention by allowing speculative forwarding of shared objects from locally committed transactions awaiting total order to running transactions. PXDUR allows transactions running in parallel to use speculative forwarding, thereby enabling the system to utilize highly parallel multi-core platforms. PXDUR also enhances performance by optimizing the transaction commit process: it allows committing transactions to skip read-set validation when it is safe to do so. PXDUR achieves performance gains of an order of magnitude over its closest competitors under favorable conditions. Transactions also form an important part of centralized DBMSs, which tend to support multi-threaded access to utilize highly parallel hardware platforms. Applications can be wrapped in transactions, which can then access the DBMS under the rules of concurrency control. This allows users to develop applications that can run on DBMSs without worrying about synchronization. Serializability is the de facto standard form of isolation required by transactions for many applications. The existing methods employed by DBMSs to enforce serializability rely on explicit fine-grained locking. This eager locking approach is pessimistic and can be too conservative for many applications, and it can severely limit the performance of DBMSs, especially in scenarios with moderate to high contention. This leads to the second major contribution of this thesis: TSAsR, an adaptive transaction processing framework that can be applied to DBMSs to improve performance. TSAsR allows the DBMS's internal synchronization to be more relaxed and enforces serializability through the processing of external metadata in an optimistic manner.
It does not require any changes to the application code and achieves orders-of-magnitude performance improvements in high- and moderate-contention cases. Replicated transaction processing systems require a distributed algorithm to keep the system consistent by ensuring that each node executes the same sequence of deterministic commands. These algorithms generally employ State Machine Replication (SMR). Enhancing the performance of such algorithms is a potential way to increase the performance of distributed systems. However, developing new SMR algorithms is limited in production settings because of the huge verification cost involved in proving their correctness. There are frameworks that allow easy specification of SMR algorithms and subsequent verification; however, algorithms implemented in such frameworks give poor performance. This leads to the third major contribution of this thesis: Verified JPaxos, a JPaxos-based runtime system that can be integrated with an easy-to-verify I/O automaton based on the Multipaxos protocol. Multipaxos is specified in Higher Order Logic (HOL) for ease of verification, and the specification is used to generate executable code representing the Multipaxos state changes (the I/O automaton). The runtime drives the HOL-generated code and interacts with the service and network to create a fully functional replicated Multipaxos system. The runtime inherits its design from JPaxos, along with some optimizations. It achieves significant improvement over a state-of-the-art SMR verification framework while remaining comparable in performance to non-verified systems. / Master of Science
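To make the local certification step concrete, here is a minimal Python sketch, with invented class and field names rather than PXDUR's actual implementation, of a DUR node validating a transaction's read-set against the write-sets of transactions that committed locally after its snapshot, so conflicting transactions abort without burdening the ordering layer.

```python
class Transaction:
    def __init__(self, start_version):
        self.start_version = start_version   # snapshot version the txn read from
        self.read_set = set()                # ids of objects read
        self.write_set = {}                  # object id -> new value

class LocalCertifier:
    """Per-node certification of transactions before they reach total order."""

    def __init__(self):
        self.version = 0
        self.committed_writes = []           # list of (commit version, written ids)

    def certify(self, txn):
        """True iff no object in txn's read-set was overwritten since its snapshot."""
        for version, writes in self.committed_writes:
            if version > txn.start_version and writes & txn.read_set:
                return False                 # local conflict: abort cheaply
        return True

    def commit(self, txn):
        if not self.certify(txn):
            return False
        self.version += 1
        self.committed_writes.append((self.version, set(txn.write_set)))
        return True   # in a real DUR system the txn would now await total order

# Example: t2 read object "x" from an old snapshot, but t1 overwrote it meanwhile.
node = LocalCertifier()
t1 = Transaction(start_version=0); t1.write_set = {"x": 42}
t2 = Transaction(start_version=0); t2.read_set = {"x"}
print(node.commit(t1), node.commit(t2))   # True False
```

PXDUR's speculative forwarding goes further by letting running transactions read from such locally committed but not yet totally ordered writes; that mechanism is omitted from this sketch.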
605

DMAS: A Display Measurement and Analysis System with an object-oriented database

Qian, Yihong 11 June 2009 (has links)
Current commercial measurement systems are used primarily for performing measurements and recording data. Measurement users must either expend extra effort to store and maintain other measurement information (metadata) or customize the measurement system to make it functionally complete. A software measurement environment using advanced data management techniques in an open architecture seems highly desirable. To create such an environment, a Display Measurement and Analysis System (DMAS) was designed and constructed using the object-oriented paradigm and object-oriented database (OODB) management techniques. The purpose of the system is to serve as a testbed for new-generation measurement systems and for overcoming the limitations of conventional systems. This thesis proposes a new object data model for display measurement and analysis applications. The components of this data model are object classes. The generation of the data model involved four steps, dealing with objects and classes at given levels of abstraction, their semantics, and their relationships. A prototype system based on this model has been developed. It used an object data management system to support persistent object storage. The development of the DMAS database management subsystem consists of the construction of an object schema and an object management interface. The research illustrates that the OODB approach facilitates scientific measurement by capturing metadata and data together explicitly and flexibly. Furthermore, it shows that an OODB has the ability to represent complex semantics, to associate objects with metadata, and to map a lucid interface easily to objects. / Master of Science
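A minimal sketch of the idea of storing measurement data and its metadata together as one object graph is shown below; the class and attribute names are invented for illustration and do not reflect DMAS's actual schema or any particular OODB product's API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List

@dataclass
class Instrument:
    """Metadata about the measuring device."""
    model: str
    serial_number: str
    calibration_date: datetime

@dataclass
class Measurement:
    """A measurement object that carries its metadata alongside the raw data."""
    quantity: str
    values: List[float]          # the recorded data points
    units: str
    taken_at: datetime
    instrument: Instrument       # metadata travels with the data
    notes: str = ""

m = Measurement(
    quantity="luminance",
    values=[120.3, 119.8, 121.0],
    units="cd/m^2",
    taken_at=datetime(2009, 6, 11, 14, 30),
    instrument=Instrument("Photometer-X1", "SN-0042", datetime(2009, 1, 5)),
)
print(m.instrument.model, len(m.values))
```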
606

Quantifying the Safety Impacts of Intelligent Transportation Systems

Avgoustis, Alexis 02 June 1999 (has links)
An average of 6.5 million crashes are reported to the police every year in the United States. Safety is increasingly important considering the rapid increase in traffic volume on American roads. This thesis describes the development of a safety model whose primary objective is to capture the benefits of Intelligent Transportation Systems (ITS) on safety. The specific ITS component examined in more detail is traffic signal coordination. The model was tested in a micro-simulation environment using the INTEGRATION traffic simulation model as well as in a field-data evaluation. The General Estimates System (GES) database was chosen as the primary national database from which to extract accident data. These data were used to develop the statistical foundation for the safety model. Crash rates were produced using extracted crash frequencies and annual vehicle-miles-traveled figures from Highway Statistics (FHWA, 1997). Regression analysis was performed to predict the behavior of several crash types as they were associated with a variety of variables, for example the facility speed limit and the time the crash occurred. The model was implemented in FORTRAN code that estimates the accident risk of a facility based on its free-speed. Two methods were used to test the model: (1) field data from the city of Phoenix, Arizona were used with a GPS (Global Positioning System) floating car that tracked the accident risk on a second-by-second basis; before-and-after signal coordination scenarios were tested, yielding the result that the accident risk is lower in the after scenario; (2) the model was then tested in a micro-simulation environment using the INTEGRATION traffic model, with a hypothetical network as well as the Scottsdale/Rural Road corridor in Phoenix. The sensitivity analysis of before-and-after signal coordination scenarios indicated that after the signals were coordinated the crash risk was lower, showing that the model can capture the benefits of this ITS component. Reducing the number of crashes is an important aspect of improving safety. Traffic signal coordination smooths traffic on a facility and reduces its potential accident risk by producing fewer vehicle-to-vehicle interactions. Also, traffic signal control increases the free-speed of a facility. The advantage of this safety model is that it can be used to capture a variety of ITS technologies, not only the signal coordination examined in more detail in this thesis. / Master of Science
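As a rough illustration of applying a free-speed-based crash-risk function along a floating-car trace, the Python sketch below accumulates risk second by second; the functional form, coefficients, and speed traces are invented placeholders, not the thesis's fitted GES regression or its FORTRAN implementation.

```python
import math

def crash_risk_per_vmt(free_speed_mph, a=1.0e-6, b=-0.03):
    """Hypothetical crash rate (crashes per vehicle-mile traveled) as a
    function of facility free-speed; a and b are placeholder coefficients."""
    return a * math.exp(b * free_speed_mph)

def trace_risk(speeds_mph, free_speed_mph):
    """Accumulate accident risk second by second along a 1 Hz GPS speed trace."""
    total = 0.0
    for v in speeds_mph:
        miles_this_second = v / 3600.0                 # mph -> miles per second
        total += crash_risk_per_vmt(free_speed_mph) * miles_this_second
    return total

# Invented speed traces for the same approach route before and after coordination.
before = trace_risk([12, 0, 0, 25, 30, 10, 0, 28], free_speed_mph=35)
after = trace_risk([25, 27, 30, 31, 30, 29, 31, 30], free_speed_mph=45)
print(f"before: {before:.3e}  after: {after:.3e}")
```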
607

The Effects of Data Models and Conceptual Models of the Structured Query Language on the Task of Query Writing by End Users

Wu, Bruce Jiinpo 08 1900 (has links)
This research is an empirical investigation of human factors in the use of database systems. The problem motivating the study is the difficulty encountered by end users in retrieving data from a database.
608

Modeling imprecise time intervals in temporal databases

Cheng, Xin 01 April 2001 (has links)
No description available.
609

Automated generation of XML documents for data transportation between relational database DTDs

Wang, Lu 01 April 2001 (has links)
No description available.
610

Scalable technologies for distributed multimedia systems

Sheu, Fenn Huei (Simon) 01 January 1999 (has links)
No description available.
