71 |
The Completeness Problem of Ordered Relational Databases
Jiang, Wei January 2010
Support of order in query processing is a crucial component in
relational database systems, not only because the output of a
query is often required to be sorted in a specific order, but also
because employing order properties can significantly reduce the
query execution cost. Therefore, finding an effective approach to
answer queries over ordered data is important to the efficiency of
query processing in relational databases.
In this dissertation, an ordered relational database model is
proposed, which captures both data tuples of relations and tuple
ordering in relations. Based on this conceptual model, ordered
relational queries are formally defined in a two-sorted first-order calculus, which serves as a yardstick to evaluate
the expressive power of other ordered query representations.
The primary purpose of this dissertation is to investigate the
expressive power of different ordered query representations.
In particular, the completeness problem of ordered relational
algebras is studied with respect to the first-order calculus:
does there exist an ordered algebra such that any first-order expressible ordered
relational query can be expressed by a finite sequence of ordered
operations? The completeness problem matters because a complete
ordered relational algebra makes it possible to implement a finite
set of ordered operators that expresses all
first-order expressible ordered queries in relational databases.
The dissertation then focuses on the completeness problem of
ordered conjunctive queries. This investigation is performed in an
incremental manner: first, the ordered conjunctive queries with
data-decided order are considered; then,
the ordered conjunctive queries with t-decided order are
studied; finally, the completeness problem for the general ordered
conjunctive queries is explored. The completeness theorem
of ordered algebras is proven for all three classes of ordered
conjunctive queries.
Although this ordered relational database model is only
conceptual, and ordered operators are not implemented in this
dissertation, we do prove that a complete set of ordered operators
exists to express all first-order expressible ordered queries in
the three classes of ordered conjunctive queries. This research
sheds light on the possibility of implementing a complete set of
ordered operators in relational databases to solve the performance
problem of order-relevant queries.
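The claim that order properties can reduce execution cost can be illustrated with a textbook merge join. This is a minimal sketch and not an operator from the dissertation: the function name `merge_join` and the sample rows are assumptions, and the join key is assumed to be the first field of every row. Because both inputs are already sorted on the key, each is scanned once and the output stays in key order.

```python
def merge_join(left, right):
    """Join two lists of tuples that are ALREADY sorted on their first field.

    Each input is scanned once from front to back, so no sorting or
    hashing is needed, and the result is produced in key order.
    """
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][0], right[j][0]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Find the run of right rows that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][0] == left_key:
                j_end += 1
            # Pair every left row with that key against the whole run.
            while i < len(left) and left[i][0] == left_key:
                for right_row in right[j:j_end]:
                    result.append(left[i] + right_row[1:])
                i += 1
            j = j_end
    return result


# Hypothetical sample data, sorted on the first field.
left_rows = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
right_rows = [(2, "x"), (2, "y"), (3, "z"), (4, "w")]
joined = merge_join(left_rows, right_rows)
```

If the inputs were not known to be sorted, a join would first have to sort or hash them, which is the extra cost that exploiting order avoids.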
|
72 |
INDEX STRUCTURES FOR XML DATABASES
MOHAMMAD, SAMIR 16 March 2011
Extensible Markup Language (XML) is a de facto standard for data exchange on the World Wide Web. Indexing plays a key role in improving the execution of XML queries over that data. In this thesis we discuss the three main categories of indexes proposed in the literature to handle the XML semistructured data model, and identify limitations and open problems related to these indexing schemes. Based on our findings, we propose two novel XML index structures to overcome most of these limitations: a native index structure called Level-based Tree Index for XML databases (LTIX) and a relational index structure called Universal Index Structure for XML data (UISX).
A proper labeling scheme is an essential part of a well-built XML index structure. We found that existing labeling schemes are not suitable for our index structures and therefore propose a novel labeling scheme, Level-based Labeling Scheme (LLS), which has the advantages of most popular types of labeling schemes while eliminating the main disadvantages. We then combine our LLS labeling scheme with our index structures. An evaluation shows that LLS performs well in comparison to existing labeling schemes using different mappings to relational tables.
We propose the LTIX to minimize the number of joins and matches required to evaluate twig queries, and also to facilitate effective query optimization through early pruning of the search space. Our experimental results show that this approach performs well in comparison to existing state-of-the-art approaches.
We propose the UISX to overcome the key problem with the state-of-the-art approaches, namely that they cannot support efficient processing of twig queries without requiring significant storage. We use a light-weight native XML engine on top of an SQL engine to perform the optimization related to the structure of the XML data prior to shredding. Experimental results show that our approach achieves lower response times than other similar approaches while using less space to store XML data. / Thesis (Ph.D, Computing) -- Queen's University, 2011-03-15 23:03:50.15
|
73 |
An In-memory Database for Prototyping Anomaly Detection Algorithms at Gigabit Speeds
Friesen, Travis 11 September 2013
The growing speeds of computer networks are pushing anomaly detection algorithms and related systems to their limits. This thesis discusses the design of the Object Database (ODB), an analysis framework for evaluating anomaly detection algorithms in real time at gigabit or better speeds. To accomplish this, the document also discusses the construction of a new dataset with known anomalies for verification purposes. Lastly, demonstrating the efficacy of the system required implementing an existing algorithm on it. That implementation showed that the system is suitable for evaluating anomaly detection algorithms, but that this particular algorithm was not appropriate for use at the packet-data level.
|
74 |
Mining frequent itemsets from uncertain data: extensions to constrained mining and stream mining
Hao, Boyu 19 July 2010
Most studies on frequent itemset mining focus on mining precise data. However, there are situations in which the data are uncertain. This leads to the mining of uncertain data. There are also situations in which users are only interested in frequent itemsets that satisfy user-specified aggregate constraints. This leads to constrained mining of uncertain data. Moreover, floods of uncertain data can be produced in many other situations. This leads to stream mining of uncertain data. In this M.Sc. thesis, we propose algorithms to deal with all these situations. We first design a tree-based mining algorithm to find all frequent itemsets from databases of uncertain data. We then extend it to mine databases of uncertain data for only those frequent itemsets that satisfy user-specified aggregate constraints and to mine streams of uncertain data for all frequent itemsets. Experimental results show the effectiveness of all these algorithms.
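To make the constrained setting concrete, here is a minimal sketch of frequent itemset mining over uncertain data with an aggregate constraint. It is not the tree-based algorithm of the thesis: it enumerates itemsets by brute force, and it assumes the expected-support measure commonly used for uncertain data (the sum, over transactions, of the product of the item probabilities, with items treated as independent). The sample transactions, the `price` table, and all function names are hypothetical.

```python
from itertools import combinations
from math import prod

# Hypothetical uncertain database: each transaction maps an item to the
# probability that the item is really present in that transaction.
transactions = [
    {"a": 0.9, "b": 0.5, "c": 0.4},
    {"a": 0.8, "c": 0.5},
    {"b": 0.5, "c": 1.0},
]

# Hypothetical item prices used by the aggregate constraint below.
price = {"a": 10, "b": 25, "c": 40}


def expected_support(itemset, db):
    """Sum over transactions of the product of the items' probabilities.

    Transactions that lack any item of the itemset contribute nothing.
    """
    return sum(
        prod(t[item] for item in itemset)
        for t in db
        if all(item in t for item in itemset)
    )


def frequent_itemsets(db, minsup, constraint=lambda itemset: True):
    """Brute-force search: keep itemsets that satisfy the constraint and
    whose expected support reaches minsup."""
    items = sorted({item for t in db for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            if not constraint(itemset):
                continue
            support = expected_support(itemset, db)
            if support >= minsup:
                result[itemset] = support
    return result


# Unconstrained mining versus mining under "total price <= 45".
all_frequent = frequent_itemsets(transactions, 0.75)
cheap_frequent = frequent_itemsets(
    transactions, 0.75, lambda s: sum(price[i] for i in s) <= 45
)
```

With this data, the itemset `("a", "c")` is frequent without the constraint but is filtered out by the price limit, since its total price is 50.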
|
75 |
Frequent pattern mining of uncertain data streams
Jiang, Fan January 2011
When dealing with uncertain data, users may not be certain about the presence of an item in the database. For example, due to inherent instrumental imprecision or errors, data collected by sensors are usually uncertain. In various real-life applications, uncertain databases are not necessarily static: new data may arrive continuously and at a rapid rate. These uncertain data can come in batches, forming a data stream. To discover useful knowledge in the form of frequent patterns from streams of uncertain data, algorithms have been developed to use the sliding window model for processing and mining data streams. However, for some applications, the landmark window model and the time-fading model are more appropriate. In this M.Sc. thesis, I propose tree-based algorithms that use the landmark window model or the time-fading model to mine frequent patterns from streams of uncertain data. Experimental results show the effectiveness of the proposed algorithms.
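The difference between the two window models can be sketched as follows. This is an illustration under stated assumptions, not the tree-based algorithms of the thesis: it assumes expected support as the frequency measure, assumes that the landmark model weights every batch since the landmark equally, and assumes that the time-fading model multiplies a batch's contribution by `alpha ** age`, where `alpha` is a fading factor in (0, 1]. The batches and function names are hypothetical.

```python
from math import prod


def batch_expected_support(itemset, batch):
    """Expected support of an itemset within a single batch."""
    return sum(
        prod(t[item] for item in itemset)
        for t in batch
        if all(item in t for item in itemset)
    )


def landmark_support(itemset, batches):
    """Landmark window: all batches since the landmark count equally."""
    return sum(batch_expected_support(itemset, b) for b in batches)


def time_fading_support(itemset, batches, alpha):
    """Time-fading: a batch that is `age` steps old is weighted by
    alpha ** age, so the newest batch (age 0) has weight 1."""
    newest = len(batches) - 1
    return sum(
        alpha ** (newest - index) * batch_expected_support(itemset, b)
        for index, b in enumerate(batches)
    )


# Hypothetical stream of two batches, oldest first.
stream = [
    [{"a": 0.8, "b": 0.5}, {"a": 0.4}],
    [{"a": 1.0, "b": 0.5}],
]
landmark_a = landmark_support(("a",), stream)
faded_a = time_fading_support(("a",), stream, alpha=0.5)
```

Setting `alpha` to 1 makes the two measures coincide, while smaller values let recent batches dominate.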
|
76 |
Mining frequent patterns from uncertain data with MapReduce
Hayduk, Yaroslav 04 April 2012
Frequent pattern mining from uncertain data allows data analysts to mine frequent patterns from probabilistic databases, within which each item is associated with an existential probability representing the likelihood of the presence of the item in the transaction. When compared with precise data, the solution space for mining uncertain data is often much larger due to the probabilistic nature of uncertain databases. Thus, uncertain data mining algorithms usually take substantially more time to execute. Recent studies show that the MapReduce programming model yields significant performance gains for data mining algorithms that can be mapped to the map and reduce execution phases of MapReduce. An attractive feature of MapReduce is fault tolerance, which permits detecting and restarting failed jobs on working machines. In this M.Sc. thesis, I explore the feasibility of applying MapReduce to frequent pattern mining of uncertain data. Specifically, I propose two algorithms for mining frequent patterns from uncertain data with MapReduce.
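As a rough illustration of how such a computation maps onto the two phases, the sketch below simulates map, shuffle, and reduce in a single Python process to find frequent single items by expected support. It is not one of the two algorithms proposed in the thesis and uses no real MapReduce framework: the function names, the in-memory shuffle, and the sample database are all assumptions made for the example.

```python
from collections import defaultdict


def map_phase(transaction):
    """Map: emit one (item, existential probability) pair per item."""
    for item, probability in transaction.items():
        yield item, probability


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(item, probabilities):
    """Reduce: the expected support of a single item is the sum of its
    existential probabilities across transactions."""
    return item, sum(probabilities)


def frequent_items(db, minsup):
    """Run the simulated pipeline and keep items reaching minsup."""
    pairs = (pair for transaction in db for pair in map_phase(transaction))
    reduced = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
    return {item: s for item, s in reduced.items() if s >= minsup}


# Hypothetical probabilistic database.
database = [
    {"a": 0.9, "b": 0.2},
    {"a": 0.6, "c": 0.5},
    {"b": 0.3, "c": 0.5},
]
frequent = frequent_items(database, minsup=0.9)
```

Because each transaction is mapped independently and each item is reduced independently, both phases could in principle be spread across machines.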
|
77 |
Incremental computation methods in valid and transaction time databases
Aleksic, Mario January 1996
No description available.
|
78 |
Multicast communication in distributed systems with dynamic groups
Belkeir, Nasr Eddine January 1991
No description available.
|
79 |
Improving the performance and scalability of intermittently synchronized database systems
Yee, Wai Gen January 2003
No description available.
|
80 |
Comparative analysis of PropertyFirst vs. EntityFirst modeling approaches in graph databases
2015 March 1900
While relational databases still hold the primary position in the database technology domain, and have held a leading position longer than any other Computer Science technology, they now have, for the first time, a valid and worthy opponent in the NoSQL database movement.
NoSQL databases, although unfamiliar to many people, including a significant number of Computer Science practitioners, have spread rapidly in many shapes and forms, and in quite a chaotic fashion. Design and modeling for them have likewise been undertaken in an unstructured manner. Currently they are subcategorized into four main groups: key-value stores, column family stores, document stores and graph databases.
In this thesis, different modeling approaches for graph databases, applied to the same domain, are analyzed and compared, especially from a design perspective.
The database selected as the implementation technology is Neo4j by Neo Technology, a directed property graph database, which means that each relationship between its data entities must have a starting and an ending (or source and destination) node.
This research provides an overview of two competing modeling approaches and evaluates them in the context of a real-world example.
The work done here shows that both modeling approaches are valid: a data model for the same domain data can be fully developed with either approach, and both models can later support application access in a similar fashion. One of the models provides faster access to data, but at the cost of higher maintenance and increased complexity.
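The directed property graph idea mentioned in this abstract can be sketched in plain Python. This is a toy in-memory structure, not the Neo4j API and not either modeling approach from the thesis: the class names, method names, labels, and sample data are all hypothetical. It shows only the two defining traits, namely that nodes and relationships carry properties and that every relationship has exactly one start node and one end node.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    labels: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    rel_type: str
    start: Node  # source node: direction runs from start ...
    end: Node    # ... to end, so a relationship is never dangling
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Toy directed property graph held entirely in memory."""

    def __init__(self):
        self.nodes = {}
        self.relationships = []

    def add_node(self, node_id, labels=(), **properties):
        node = Node(node_id, set(labels), dict(properties))
        self.nodes[node_id] = node
        return node

    def relate(self, start_id, rel_type, end_id, **properties):
        # Looking up both endpoints raises KeyError if either is missing,
        # which enforces that each relationship has a start and an end.
        rel = Relationship(
            rel_type, self.nodes[start_id], self.nodes[end_id], dict(properties)
        )
        self.relationships.append(rel)
        return rel

    def outgoing(self, node_id, rel_type=None):
        """Relationships that start at the given node, optionally by type."""
        return [
            r
            for r in self.relationships
            if r.start.node_id == node_id
            and (rel_type is None or r.rel_type == rel_type)
        ]


# Hypothetical sample data.
graph = PropertyGraph()
graph.add_node(1, ["Person"], name="Ada")
graph.add_node(2, ["Document"], title="Graph modeling notes")
graph.relate(1, "WROTE", 2, role="author")
```

Direction matters here: `graph.outgoing(1)` returns the `WROTE` relationship, while `graph.outgoing(2)` returns nothing.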
|