51. Concept hierarchies for extensible databases. Barnes, Christopher A. January 1990.
Thesis (M.S. in Information Systems), Naval Postgraduate School, September 1990. Thesis advisor: Dolk, Daniel R.; second reader: Bradley, Gordon H. DTIC descriptors: systems engineering, semantics, data bases, computers, numerical analysis, efficiency, theses, consistency, interrogation, numbers, hierarchies, sizes (dimensions), data management, dictionaries, measurement. Author's subject terms: database design, data manipulation, semantic networks. Includes bibliographical references (p. 73-74).
52. GEMS: Gossip-Enabled Monitoring Service for heterogeneous distributed systems. Raman, Pirabhu. January 2002.
Thesis (M.S.), University of Florida, 2002. Includes vita and bibliographical references.
53. Incomplete information in a deductive database. Kong, Qinzheng. January 1989.
No description available.
54. URA: a universal data replication architecture. Zheng, Zheng. 10 September 2012.
Data replication is a key building block for large-scale distributed systems to improve availability, performance, and scalability. Because there is a fundamental trade-off between performance and consistency, as well as between availability and consistency, systems must make trade-offs among these factors based on the demands and technologies of their target environments and workloads. Unfortunately, existing replication protocols and mechanisms are intrinsically entangled with specific policy assumptions. Therefore, to accommodate new trade-offs for new policy requirements, developers have to either build a new replication system from scratch or modify existing mechanisms. This dissertation presents a universal data replication architecture (URA) that cleanly separates mechanism and policy and supports Partial Replication (PR), Any Consistency (AC), and Topology Independence (TI) simultaneously. Our architecture yields two significant advantages. First, by providing a single set of mechanisms that capture the common underlying abstractions for data replication, URA can serve as a common substrate for building and deploying new replication systems. It therefore can significantly reduce the effort required to construct or modify a replication system. Second, by providing a set of general and flexible mechanisms independent of any specific policy, URA enables better trade-offs than any current system can provide. In particular, URA can simultaneously provide the three PRACTI properties, while any existing system can provide at most two of them. Our experimental results and case-study systems confirm that universal data replication architecture is a way to build better replication systems and a better way to build replication systems.
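The mechanism/policy separation the dissertation argues for can be pictured with a small sketch. Everything below is illustrative and assumes invented names and a last-writer-wins merge; it is not URA's actual interface, only the flavor of a policy-free log-exchange mechanism that a gossip-style, topology-independent policy can drive:

```python
# Illustrative sketch only: the replication *mechanism* stores versioned
# writes and exchanges logs; the *policy* decides which replicas talk and
# when. Names and structure are assumptions, not URA's real API.

class ReplicaCore:
    """Mechanism: versioned writes plus log exchange, free of policy."""
    def __init__(self, node_id):
        self.node_id = node_id
        self.clock = 0
        self.log = []        # (timestamp, node, key, value) tuples
        self.store = {}      # key -> (timestamp, node, value)

    def write(self, key, value):
        self.clock += 1
        self.log.append((self.clock, self.node_id, key, value))
        self.store[key] = (self.clock, self.node_id, value)

    def entries_since(self, ts):
        return [e for e in self.log if e[0] > ts]

    def apply(self, entries):
        # Last-writer-wins merge; other consistency policies would plug in
        # here without changing the log-exchange mechanism itself.
        for ts, node, key, value in entries:
            self.clock = max(self.clock, ts)
            cur = self.store.get(key)
            if cur is None or (ts, node) > (cur[0], cur[1]):
                self.store[key] = (ts, node, value)
            self.log.append((ts, node, key, value))

def gossip_policy(a, b, since=0):
    """Policy: pairwise sync between *any* two replicas at *any* time,
    i.e. topology independence lives outside the mechanism."""
    entries_a, entries_b = a.entries_since(since), b.entries_since(since)
    a.apply(entries_b)
    b.apply(entries_a)

r1, r2 = ReplicaCore("A"), ReplicaCore("B")
r1.write("x", 1)
r2.write("y", 2)
gossip_policy(r1, r2)
print(r1.store["y"], r2.store["x"])   # both replicas now hold both keys
```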
55. Advanced analysis and join queries in multidimensional spaces. Ge, Shen (葛屾). January 2012.
Multidimensional data are ubiquitous, and their efficient management and analysis is a core database research problem. Much previous work has focused on indexing, analyzing and querying multidimensional data. In this dissertation, three challenging advanced analysis and join problems in multidimensional spaces are proposed and studied, providing efficient solutions for their related applications.
First, the problem of the generalized budget constrained optimization query (GenBOQ) is studied. In real life, it is often difficult for manufacturers to create new products that dominate their competitors, due to various constraints. These constraints can be modeled by constraint functions, and the problem is then to decide the best possible regions in multidimensional space where the features of a new product could be placed. Using the number of dominating and dominated objects, the profitability of these regions can be evaluated and the best areas returned. Although GenBOQ computation is challenging due to its high complexity, an efficient divide-and-conquer framework is offered for this problem. In addition, an approximation method is proposed, trading result quality for query cost.
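As a hedged illustration of the dominance semantics such a query builds on (assuming smaller attribute values are better; the orientation used in the thesis may differ), a candidate product position can be scored by how many competitors it dominates versus how many dominate it:

```python
# Sketch of GenBOQ-style dominance scoring for a single candidate point.
# Assumption: smaller is better in every dimension.

def dominates(p, q):
    """p dominates q: no worse in all dimensions, strictly better in one."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))

def profitability(candidate, competitors):
    won = sum(dominates(candidate, c) for c in competitors)
    lost = sum(dominates(c, candidate) for c in competitors)
    return won - lost

competitors = [(3.0, 2.0), (1.0, 4.0), (2.5, 2.5)]
print(profitability((2.0, 2.0), competitors))   # 2: dominates two rivals, dominated by none
```

The actual query evaluates whole regions of candidate positions under constraint functions, which is where the divide-and-conquer framework earns its keep; the point-wise test above is only the primitive inside it.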
Next, the efficient evaluation of all top-k queries (ATOPk) in multidimensional spaces is investigated: given a group of preference functions, the top-ranked objects under every function are computed simultaneously. As an application of such a query, consider an online store that needs to provide recommendations for a large number of users at once. This problem has been somewhat overlooked by past research; in this thesis, batch algorithms are proposed instead of naïvely evaluating top-k queries individually. Similar preferences are grouped together, and two algorithms are proposed, one using block indexed nested loops and one using a view-based thresholding strategy. The optimized view-based threshold algorithm is demonstrated to be consistently the best. Moreover, an all top-k query helps to evaluate other queries that rely on the results of multiple top-k queries, such as reverse top-k queries and top-m influential queries proposed in previous works. It is shown that applying the view-based approach to these queries improves on the current state-of-the-art by orders of magnitude.
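A minimal sketch fixes the problem statement. The code below is the naïve per-function evaluation that the batch algorithms are designed to beat, assuming linear preference functions over numeric attributes; the view-based threshold algorithm itself is not reproduced here:

```python
import heapq

# Naive ATOPk baseline: one independent top-k evaluation per preference
# vector. The thesis's batch algorithms share work across similar vectors.

def all_top_k(objects, weight_vectors, k):
    results = []
    for w in weight_vectors:
        score = lambda obj: sum(wi * xi for wi, xi in zip(w, obj))
        results.append(heapq.nlargest(k, objects, key=score))
    return results

objects = [(0.9, 0.1), (0.5, 0.5), (0.2, 0.8)]
prefs = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]   # three users' weightings
for w, top in zip(prefs, all_top_k(objects, prefs, k=2)):
    print(w, top)
```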
Finally, the problem of spatio-textual similarity joins (ST-SJOIN) on multidimensional data is considered. Given both spatial and textual information, ST-SJOIN retrieves pairs of objects which are both spatially close and textually similar. One possible application of this query is friendship recommendation, matching people who not only live nearby but also share common interests. By combining the state-of-the-art strategies of spatial distance joins and set similarity joins, efficient query processing algorithms are proposed, taking both spatial and textual constraints into account. A batch processing strategy is also introduced to boost performance, and it is also effective for the original textual-only joins. Using synthetic and real datasets, it is shown that the proposed techniques outperform the baseline solutions.
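The join predicate itself is easy to state; the research challenge is avoiding the quadratic scan below. A baseline sketch, assuming Euclidean distance for the spatial constraint and Jaccard similarity over keyword sets for the textual one:

```python
from math import dist

# ST-SJOIN baseline: objects join if within distance eps AND their keyword
# sets have Jaccard similarity >= tau. The thesis evaluates this predicate
# far faster by combining spatial-join and set-similarity-join techniques.

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def st_sjoin(objects, eps, tau):
    pairs = []
    for i in range(len(objects)):
        for j in range(i + 1, len(objects)):
            (p, s), (q, t) = objects[i], objects[j]
            if dist(p, q) <= eps and jaccard(s, t) >= tau:
                pairs.append((i, j))
    return pairs

users = [((0.0, 0.0), {"hiking", "jazz"}),
         ((0.1, 0.1), {"hiking", "jazz", "chess"}),
         ((5.0, 5.0), {"hiking", "jazz"})]
print(st_sjoin(users, eps=1.0, tau=0.5))   # [(0, 1)]: nearby and similar
```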
56. Integrating relational databases with the Semantic Web. Sequeda, Juan Federico. 4 September 2015.
An early vision in Computer Science was to create intelligent systems capable of reasoning on large amounts of data. Independent results in the areas of Description Logic and Relational Databases have advanced us towards this vision. Description Logic research has advanced the understanding of the tradeoff between the computational complexity of reasoning and the expressiveness of logic languages, and now underpins the Semantic Web. The Semantic Web comprises a graph data model (RDF), an ontology language for knowledge representation and reasoning (OWL) and a graph query language (SPARQL). Database research has advanced the theory and practice of management of data, embodying features such as views and recursion which are capable of representing reasoning. Despite these independent advances, the interface between Relational Databases and the Semantic Web is poorly understood. This dissertation revisits the vision with respect to current technology and addresses the following question: how, and to what extent, can Relational Databases be integrated with the Semantic Web? The thesis is that much of the existing Relational Database infrastructure can be reused to support the Semantic Web. Two problems are studied. First, can a Relational Database be automatically virtualized as a Semantic Web data source? This paradigm comprises a single Relational Database. The first contribution is an automatic direct mapping from a Relational Database schema and data to RDF and OWL. The second contribution is a method capable of evaluating SPARQL queries against the Relational Database, per the direct mapping, by exploiting two existing relational query optimizations. These contributions are embodied in a system called Ultrawrap. Empirical analysis consistently yields that SPARQL query execution performance on Ultrawrap is comparable to that of SQL queries written directly for the relational representation of the data. Such results have not been previously achieved. Second, can a Relational Database be mapped to existing Semantic Web ontologies and act as a reasoner? This paradigm comprises an OWL ontology including inheritance and transitivity, a Relational Database, and mappings between the two. The third contribution is a method for Relational Databases to support inheritance and transitivity by compiling the ontology as mappings, implementing the mappings as SQL views, using SQL recursion, and optimizing by materializing a subset of the views. This contribution is implemented in an extension of Ultrawrap. Empirical analysis reveals that Relational Databases are able to effectively act as reasoners.
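A sketch of what a direct mapping produces may help make the first contribution concrete. The standalone illustration below follows the table-to-class, row-to-subject, column-to-predicate pattern; the IRI layout is an assumption, and Ultrawrap itself realizes the mapping as SQL views inside the database rather than as external code:

```python
# Illustrative direct mapping: each table becomes a class, each row a
# subject IRI minted from its primary key, each column a predicate.
# BASE and the IRI scheme are assumptions for the sketch.

BASE = "http://example.org/"

def direct_map(table, pk, rows):
    triples = []
    for row in rows:
        subject = f"<{BASE}{table}/{pk}={row[pk]}>"
        triples.append((subject, "rdf:type", f"<{BASE}{table}>"))
        for column, value in row.items():
            predicate = f"<{BASE}{table}#{column}>"
            triples.append((subject, predicate, repr(value)))  # simplified literals
    return triples

rows = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]
for s, p, o in direct_map("person", "id", rows):
    print(s, p, o)
```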
57. A grid enabled staging DBMS method for data Mapping, Matching & Loading. Ahmed, Ejaz. January 2011.
This thesis is concerned with the need to deal with data anomalies, inconsistencies and redundancies within the context of data integration in grids. A data Mapping, Matching and Loading (MML) process based on a Grid Staging Catalogue Service (the MML-GSCATS method) is identified. In particular, the MML-GSCATS method consists of two mathematical algorithms developed for the MML processes. Specifically, it defines an intermediate staging facility for data storage in order to process, upload and integrate data from various small to large data repositories. With this in mind, it expands the integration notion of a database management system (DBMS) to include the MML-GSCATS method in traditional distributed and grid environments. The data mapping employed takes the form of value correspondences between source and target databases, whilst data matching consolidates distinct catalogue schemas of federated databases to access information seamlessly. To address anomalies and inconsistencies in the grid, the MML processes are applied to a healthcare case study with developed scenarios. These scenarios were used to test the MML-GSCATS method with the help of a software prototyping toolkit. Testing set benchmarks for performance, reliability and error detection (anomalies and redundancies). Cross-scenario data sets were formulated, and the results of the scenarios were compared against these benchmarks, allowing the MML-GSCATS methodology to be compared with traditional and current grid methods. Results from the testing and experiments demonstrate that MML-GSCATS is a valid method for identifying data anomalies, inconsistencies and redundancies that are produced during loading, and indicate that it outperforms traditional methods.
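As a hedged illustration of the value-correspondence style of mapping described above (the rules and field names below are invented, loosely themed on the healthcare case study):

```python
# Sketch: value correspondences translate source values into the target
# database's vocabulary during staging, before matching and loading.
# All field names and rules are hypothetical.

correspondences = {
    ("gender", "M"): ("sex", "male"),
    ("gender", "F"): ("sex", "female"),
}

def map_record(source_record):
    target = {}
    for field, value in source_record.items():
        tgt_field, tgt_value = correspondences.get((field, value), (field, value))
        target[tgt_field] = tgt_value
    return target

print(map_record({"patient_id": 17, "gender": "F"}))
# -> {'patient_id': 17, 'sex': 'female'}
```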
58. Automatic Construction of Networks of Concepts Characterizing Document Databases. Chen, Hsinchun; Lynch, K.J. January 1992.
Artificial Intelligence Lab, Department of MIS, University of Arizona. The results of a study that involved the creation of knowledge bases of concepts from large, operational textual databases are reported. Two East-bloc computing knowledge bases, both based on a semantic network structure, were created automatically using two statistical algorithms. With the help of four East-bloc computing experts, we evaluated the two knowledge bases in detail in a concept-association experiment based on recall and recognition tests. In the experiment, one of the knowledge bases that exhibited the asymmetric link property outperformed all four experts in recalling relevant concepts in East-bloc computing. The knowledge base, which contained about 20,000 concepts (nodes) and 280,000 weighted relationships (links), was incorporated as a thesaurus-like component into an intelligent retrieval system. The system allowed users to perform semantics-based information management and information retrieval via interactive, conceptual relevance feedback.
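One plausible form of the asymmetric link weighting the abstract alludes to is co-occurrence normalized by each concept's own document frequency, so that weight(i, j) differs from weight(j, i); the paper's exact cluster function may differ from this sketch:

```python
from collections import Counter
from itertools import combinations

# Sketch: build asymmetric concept links from document-term co-occurrence.
# The example terms are invented, merely themed on East-bloc computing.

def concept_links(documents):
    occur = Counter()     # documents each concept appears in
    cooccur = Counter()   # documents each concept pair shares
    for terms in documents:
        terms = set(terms)
        occur.update(terms)
        cooccur.update(combinations(sorted(terms), 2))
    links = {}
    for (i, j), n in cooccur.items():
        links[(i, j)] = n / occur[i]   # strength of j as an association of i
        links[(j, i)] = n / occur[j]   # strength of i as an association of j
    return links

docs = [{"elbrus", "supercomputer"},
        {"elbrus", "processor"},
        {"elbrus", "supercomputer", "ussr"}]
links = concept_links(docs)
print(links[("elbrus", "supercomputer")])   # 0.666...: weaker link outward
print(links[("supercomputer", "elbrus")])   # 1.0: stronger link back
```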
59. Extending SGML to accommodate database functions: A Methodological Overview. Sengupta, Arjit; Dillon, Andrew. 07 1900.
A method for augmenting an SGML document repository system with database functionality is presented. SGML (ISO 8879, 1986) has been widely accepted as a standard language for writing text with added structural information that gives the text greater applicability. Recently there has been a trend to use this structural information as metadata in databases. The complex structure of documents, however, makes it difficult to directly map the structural information in documents to database structures. In particular, the flat nature of relational databases makes it extremely difficult to model documents that are inherently hierarchical in nature. Consequently, documents are modeled in object-oriented databases (Abiteboul, Cluet, & Milo, 1993) and object-relational databases (Holst, 1995), in which SGML documents are mapped into the corresponding database models and are later reconstructed as necessary. However, this mapping strategy is not natural and can potentially cause loss of information in the original SGML documents. Moreover, interfaces for building queries for current document databases are mostly built on form-based query techniques and do not use the "look and feel" of the documents. This article introduces an implementation method for a complex-object modeling technique specifically for SGML documents and describes interface techniques tailored for text databases. Some of the concepts for a Structured Document Database Management System (SDDBMS) specifically designed for SGML documents are described. A small survey of some current products is also presented to demonstrate the need for such a system.
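The mapping difficulty the article motivates can be seen in miniature: flattening a hierarchical document into a generic relational node table is mechanical, but reconstructing the document afterwards requires recursive joins, which is precisely the unnaturalness the complex-object approach aims to avoid. A sketch (illustrative only, not the SDDBMS design):

```python
from itertools import count

# Flatten an SGML-like element tree into (id, parent_id, tag, text) rows,
# the shape a relational decomposition forces onto hierarchical documents.

doc = ("article", [("title", "On SGML"),
                   ("section", [("para", "First paragraph."),
                                ("para", "Second paragraph.")])])

def flatten(node, ids=None, parent=None, rows=None):
    ids = ids if ids is not None else count(1)
    rows = rows if rows is not None else []
    nid = next(ids)
    tag, content = node
    text = content if isinstance(content, str) else None
    rows.append((nid, parent, tag, text))
    if text is None:
        for child in content:
            flatten(child, ids, nid, rows)
    return rows

for row in flatten(doc):
    print(row)
# (1, None, 'article', None) ... each element now points at its parent,
# and getting the document back means walking those links with joins.
```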
60. Inductive Query by Examples (IQBE): A Machine Learning Approach. Chen, Hsinchun; She, Linlin. January 1994.
Artificial Intelligence Lab, Department of MIS, University of Arizona. This paper presents an incremental, inductive learning approach to query-by-examples for information retrieval (IR) and database management systems (DBMS). After briefly reviewing conventional information retrieval techniques and the prevailing database query paradigms, we introduce the ID5R algorithm, previously developed by Utgoff, for "intelligent" and system-supported query processing. We describe in detail how we adapted the ID5R algorithm for IR/DBMS applications, and we present two examples, one for IR applications and the other for DBMS applications, to demonstrate the feasibility of the approach. Using a larger test collection of about 1,000 document records from the COMPEN CD-ROM computing literature database and using recall as a performance measure, our experiment showed that the incremental ID5R performed significantly better than a batch inductive learning algorithm (called ID3) which we developed earlier. Both algorithms, however, were robust and efficient in helping users develop abstract queries from examples. We believe this research has shed light on the feasibility and the novel characteristics of a new query paradigm, namely, inductive query-by-examples (IQBE). Directions of our current research are summarized at the end of the paper.
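To make the inductive step concrete, here is a batch ID3-style sketch rather than the incremental ID5R the paper actually uses: from records a user marks relevant or irrelevant, the (attribute, value) test with the highest information gain becomes the induced query predicate. The record fields are invented:

```python
from math import log2

# Batch ID3-style induction of a single query predicate from labeled
# example records. ID5R maintains the same kind of tree incrementally.

def entropy(labels):
    if not labels:
        return 0.0
    pos = sum(labels) / len(labels)
    if pos in (0.0, 1.0):
        return 0.0
    return -(pos * log2(pos) + (1 - pos) * log2(1 - pos))

def best_test(examples):
    """examples: list of (record_dict, is_relevant) pairs."""
    labels = [rel for _, rel in examples]
    base = entropy(labels)
    best = None
    for attr in examples[0][0]:
        for value in {rec[attr] for rec, _ in examples}:
            inside = [rel for rec, rel in examples if rec[attr] == value]
            outside = [rel for rec, rel in examples if rec[attr] != value]
            remainder = (len(inside) * entropy(inside)
                         + len(outside) * entropy(outside)) / len(examples)
            gain = base - remainder
            if best is None or gain > best[0]:
                best = (gain, attr, value)
    return best

examples = [({"topic": "databases", "year": 1994}, True),
            ({"topic": "databases", "year": 1990}, True),
            ({"topic": "networks", "year": 1994}, False)]
print(best_test(examples))   # a topic-based test has the highest gain
```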