Global ETD Search

71	The Completeness Problem of Ordered Relational Databases Jiang, Wei January 2010 Support of order in query processing is a crucial component in relational database systems, not only because the output of a query is often required to be sorted in a specific order, but also because employing order properties can significantly reduce the query execution cost. Therefore, finding an effective approach to answer queries over ordered data is important to the efficiency of query processing in relational databases. In this dissertation, an ordered relational database model is proposed, which captures both data tuples of relations and tuple ordering in relations. Based on this conceptual model, ordered relational queries are formally defined in a two-sorted first-order calculus, which serves as a yardstick to evaluate the expressive power of other ordered query representations. The primary purpose of this dissertation is to investigate the expressive power of different ordered query representations. In particular, the completeness problem of ordered relational algebras is studied with respect to the first-order calculus: does there exist an ordered algebra such that any first-order expressible ordered relational query can be expressed by a finite sequence of ordered operations? The significance of studying the completeness problem of ordered relational algebras is that completeness makes it possible to implement a finite set of ordered operators that expresses all first-order expressible ordered queries in relational databases. The dissertation then focuses on the completeness problem of ordered conjunctive queries. This investigation is performed in an incremental manner: first, the ordered conjunctive queries with data-decided order are considered; then, the ordered conjunctive queries with t-decided order are studied; finally, the completeness problem for the general ordered conjunctive queries is explored.
The completeness theorem of ordered algebras is proven for all three classes of ordered conjunctive queries. Although this ordered relational database model is only conceptual, and ordered operators are not implemented in this dissertation, we do prove that a complete set of ordered operators exists to express all first-order expressible ordered queries in the three classes of ordered conjunctive queries. This research sheds light on the possibility of implementing a complete set of ordered operators in relational databases to solve the performance problem of order-relevant queries. relational databases Computer Science
72	INDEX STRUCTURES FOR XML DATABASES MOHAMMAD, SAMIR 16 March 2011 Extensible Markup Language (XML) is a de facto standard for data exchange on the World Wide Web. Indexing plays a key role in improving the execution of XML queries over that data. In this thesis we discuss the three main categories of indexes proposed in the literature to handle the XML semistructured data model, and identify limitations and open problems related to these indexing schemes. Based on our findings, we propose two novel XML index structures to overcome most of these limitations: a native index structure called Level-based Tree Index for XML databases (LTIX) and a relational index structure called Universal Index Structure for XML data (UISX). A proper labeling scheme is an essential part of a well-built XML index structure. We found that existing labeling schemes are not suitable for our index structures and therefore propose a novel labeling scheme, Level-based Labeling Scheme (LLS), which has the advantages of the most popular types of labeling schemes while eliminating their main disadvantages. We then combine our LLS labeling scheme with our index structures. An evaluation shows that LLS performs well in comparison to existing labeling schemes using different mappings to relational tables. We propose the LTIX to minimize the number of joins and matches required to evaluate twig queries, and also to facilitate effective query optimization through early pruning of the search space. Our experimental results show that this approach performs well in comparison to existing state-of-the-art approaches. We propose the UISX to overcome the key problem with the state-of-the-art approaches, namely that they cannot support efficient processing of twig queries without requiring significant storage. We use a light-weight native XML engine on top of an SQL engine to perform the optimization related to the structure of the XML data prior to shredding.
Experimental results show that our approach achieves lower response times than other similar approaches while using less space to store XML data. / Thesis (Ph.D, Computing) -- Queen's University, 2011-03-15 23:03:50.15 XML Indexes Databases
73	An In-memory Database for Prototyping Anomaly Detection Algorithms at Gigabit Speeds Friesen, Travis 11 September 2013 The growing speeds of computer networks are pushing the ability of anomaly detection algorithms and related systems to their limit. This thesis discusses the design of the Object Database, ODB, an analysis framework for evaluating anomaly detection algorithms in real time at gigabit or better speeds. To accomplish this, the document also discusses the construction of a new dataset with known anomalies for verification purposes. Lastly, demonstrating the efficacy of the system required implementing an existing algorithm on the evaluation system. The results show that while the system is suitable for the evaluation of anomaly detection algorithms, this particular anomaly detection algorithm was deemed not appropriate for use at the packet-data level. Computer Security Databases
74	Mining frequent itemsets from uncertain data: extensions to constrained mining and stream mining Hao, Boyu 19 July 2010 Most studies on frequent itemset mining focus on mining precise data. However, there are situations in which the data are uncertain. This leads to the mining of uncertain data. There are also situations in which users are only interested in frequent itemsets that satisfy user-specified aggregate constraints. This leads to constrained mining of uncertain data. Moreover, floods of uncertain data can be produced in many other situations. This leads to stream mining of uncertain data. In this M.Sc. thesis, we propose algorithms to deal with all these situations. We first design a tree-based mining algorithm to find all frequent itemsets from databases of uncertain data. We then extend it to mine databases of uncertain data for only those frequent itemsets that satisfy user-specified aggregate constraints and to mine streams of uncertain data for all frequent itemsets. Experimental results show the effectiveness of all these algorithms. Data Mining Databases
75	Frequent pattern mining of uncertain data streams Jiang, Fan January 2011 When dealing with uncertain data, users may not be certain about the presence of an item in the database. For example, due to inherent instrumental imprecision or errors, data collected by sensors are usually uncertain. In various real-life applications, uncertain databases are not necessarily static; new data may arrive continuously and at a rapid rate. These uncertain data can come in batches, forming a data stream. To discover useful knowledge in the form of frequent patterns from streams of uncertain data, algorithms have been developed to use the sliding window model for processing and mining data streams. However, for some applications, the landmark window model and the time-fading model are more appropriate. In this M.Sc. thesis, I propose tree-based algorithms that use the landmark window model or the time-fading model to mine frequent patterns from streams of uncertain data. Experimental results show the effectiveness of our algorithms. Data mining Databases
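As a rough illustration of the three window models named in entry 75, the sketch below contrasts how each would weight per-batch support values. It is not taken from the thesis: the function names, the decay factor `alpha`, and the sample numbers are assumptions, and the definitions follow the common textbook reading of each model (sliding window keeps only the most recent batches, landmark window keeps everything since a fixed starting point, time-fading discounts older batches geometrically).

```python
def sliding_window_support(batch_supports, window):
    """Sliding window: only the most recent `window` batches count."""
    return sum(batch_supports[-window:])


def landmark_support(batch_supports):
    """Landmark window: every batch since the landmark counts equally."""
    return sum(batch_supports)


def time_fading_support(batch_supports, alpha):
    """Time-fading: each time a new batch arrives, the running total is
    multiplied by the (assumed) decay factor 0 < alpha <= 1."""
    total = 0.0
    for support in batch_supports:
        total = total * alpha + support
    return total


# Hypothetical supports of one pattern in three batches, oldest first.
supports = [4.0, 1.0, 3.0]
sliding = sliding_window_support(supports, window=2)
landmark = landmark_support(supports)
fading = time_fading_support(supports, alpha=0.5)
```

With `alpha = 1` the time-fading value coincides with the landmark value, which is one way to see the two models as points on the same spectrum.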
76	Mining frequent patterns from uncertain data with MapReduce Hayduk, Yaroslav 04 April 2012 Frequent pattern mining from uncertain data allows data analysts to mine frequent patterns from probabilistic databases, within which each item is associated with an existential probability representing the likelihood of the presence of the item in the transaction. When compared with precise data, the solution space for mining uncertain data is often much larger due to the probabilistic nature of uncertain databases. Thus, uncertain data mining algorithms usually take substantially more time to execute. Recent studies show that the MapReduce programming model yields significant performance gains for data mining algorithms that can be mapped to the map and reduce execution phases of MapReduce. An attractive feature of MapReduce is fault tolerance, which permits detecting and restarting failed jobs on working machines. In this M.Sc. thesis, I explore the feasibility of applying MapReduce to frequent pattern mining of uncertain data. Specifically, I propose two algorithms for mining frequent patterns from uncertain data with MapReduce. Data mining Databases
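To make the notion of existential probability in entry 76 concrete, the following minimal sketch computes the expected support of an itemset over a small probabilistic database. It is not drawn from the thesis: expected support is a measure commonly used in the uncertain-data mining literature, the independence of items within a transaction is an assumption, and the function name and sample data are hypothetical.

```python
from math import prod


def expected_support(itemset, transactions):
    """Sum, over all transactions, of the probability that every item in
    `itemset` is present, assuming items occur independently.
    An item missing from a transaction has probability 0."""
    return sum(
        prod(transaction.get(item, 0.0) for item in itemset)
        for transaction in transactions
    )


# Hypothetical database: each transaction maps item -> existential probability.
transactions = [
    {"a": 0.9, "b": 0.5},
    {"a": 0.4, "c": 1.0},
    {"a": 1.0, "b": 0.5, "c": 0.2},
]

support_a = expected_support({"a"}, transactions)
support_ab = expected_support({"a", "b"}, transactions)
```

An itemset would then be called frequent when its expected support meets a user-chosen threshold. Because each transaction contributes independently to the sum, the per-transaction products could be computed in a map phase and added up in a reduce phase, though whether the thesis's algorithms work this way is not stated in the abstract.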
77	Incremental computation methods in valid and transaction time databases Aleksic, Mario January 1996 No description available. Database management Relational databases
78	Multicast communication in distributed systems with dynamic groups Belkeir, Nasr Eddine January 1991 No description available. Distributed databases Multichannel communication
79	Improving the performance and scalability of intermittently synchronized database systems Yee, Wai Gen January 2003 No description available. Databases Client/server computing
80	Comparative analysis of PropertyFirst vs. EntityFirst modeling approaches in graph databases 2015 March 1900 While relational databases still hold the primary position in the database technology domain, and have held it longer than any other Computer Science technology has held its own, for the first time they now have a valid and worthy opponent in the NoSQL database movement. NoSQL databases, even though many people, including a significant number of computer scientists, have not heard of them, have spread rapidly in many shapes and forms and have done so in quite a chaotic fashion. Much like the way they appeared and spread, design and modeling for them have been undertaken in an unstructured manner. Currently they are subcategorized into four main groups: key-value stores, column family stores, document stores, and graph databases. In this thesis, different modeling approaches for graph databases, applied to the same domain, are analyzed and compared, especially from a design perspective. The database selected here as the implementation technology is Neo4j by Neo Technology, a directed property graph database, which means that relationships between its data entities must have a starting and an ending (or source and destination) node. This research provides an overview of two competing modeling approaches and evaluates them in the context of a real-world example. The work done here shows that both of these modeling approaches are valid, that it is possible to fully develop a data model based on the same domain data with both approaches, and that both can be used later to support application access in a similar fashion. One of the models provides faster access to data, but at the cost of higher maintenance and increased complexity. NoSQL, Graph Databases, modeling
