Global ETD Search

11	Efficient Range and Join Query Processing in Massively Distributed Peer-to-Peer Networks Wang, Qiang January 2008 (has links) Peer-to-peer (P2P) has become a modern distributed computing architecture that supports massively large-scale data management and query processing. Complex query operators such as range operator and join operator are needed by various distributed applications, including content distribution, locality-aware services, computing resource sharing, and many others. This dissertation tackles a number of problems related to range and join query processing in P2P systems: fault-tolerant range query processing under structured P2P architecture, distributed range caching under unstructured P2P architecture, and integration of heterogeneous data under unstructured P2P architecture. To support fault-tolerant range query processing so as to provide strong performance guarantees in the presence of network churn, effective replication schemes are developed at either the overlay network level or the query processing level. To facilitate range query processing, a prefetch-based caching approach is proposed to eliminate the performance bottlenecks incurred by those data items that are not well cached in the network. Finally, a purely decentralized partition-based join query operator is devised to realize bandwidth-efficient join query processing under unstructured P2P architecture. Theoretical analysis and experimental simulations demonstrate the effectiveness of the proposed approaches. peer to peer networks range query processing join query processing Computer Science
12	Efficient Range and Join Query Processing in Massively Distributed Peer-to-Peer Networks Wang, Qiang January 2008 (has links) Peer-to-peer (P2P) has become a modern distributed computing architecture that supports massively large-scale data management and query processing. Complex query operators such as range operator and join operator are needed by various distributed applications, including content distribution, locality-aware services, computing resource sharing, and many others. This dissertation tackles a number of problems related to range and join query processing in P2P systems: fault-tolerant range query processing under structured P2P architecture, distributed range caching under unstructured P2P architecture, and integration of heterogeneous data under unstructured P2P architecture. To support fault-tolerant range query processing so as to provide strong performance guarantees in the presence of network churn, effective replication schemes are developed at either the overlay network level or the query processing level. To facilitate range query processing, a prefetch-based caching approach is proposed to eliminate the performance bottlenecks incurred by those data items that are not well cached in the network. Finally, a purely decentralized partition-based join query operator is devised to realize bandwidth-efficient join query processing under unstructured P2P architecture. Theoretical analysis and experimental simulations demonstrate the effectiveness of the proposed approaches. peer to peer networks range query processing join query processing Computer Science
13	Factorisation in relational databases Zavodny, Jakub January 2014 (has links) We study representation systems for relational data based on relational algebra expressions with unions, products, and singleton relations. Algebraic factorisation using the distributivity of product over union allows succinct representation of many-to-many relationships; further succinctness is brought by sharing repeated subexpressions. We show that these techniques are especially applicable to results of conjunctive queries. In the first part of the dissertation we derive tight asymptotic size bounds for two flavours of factorised representations of results of conjunctive queries. Any conjunctive query is characterised by rational parameters that govern the factorisability of its results independently of the database instance. We relate these parameters to fractional edge covers and fractional hypertree decompositions. Factorisation naturally extends from relational data to its provenance. We characterise conjunctive queries by tight bounds on their readability, which captures how many times each input tuple is used to contribute to an output tuple, and we define syntactically the class of queries with bounded readability. In the second part of the dissertation we describe FDB, a relational database engine that uses factorised representations at the physical layer to reduce data redundancy and boost query performance. We develop algorithms for optimisation and evaluation of queries with selection, projection, join, aggregation and order-by clauses on factorised representations. By introducing novel operators for factorisation restructuring and a new optimisation objective to maintain intermediate and final results succinctly factorised, we allow query evaluation with lower time complexity than on flat relations. Experiments show that for data sets with many-to-many relationships, FDB can outperform relational engines by orders of magnitude. 005.75
14	Main-Memory Query Processing Utilizing External Indexes Truong, Thanh January 2016 (has links) Many applications require storage and indexing of new kinds of data in main-memory, e.g. color histograms, textures, shape features, gene sequences, sensor readings, or financial time series. Even though, many domain index structures were developed, very a few of them are implemented in any database management system (DBMS), usually only B-trees and hash indexes. A major reason is that the manual effort to include a new index implementation in a regular DBMS is very costly and time-consuming because it requires integration with all components of the DBMS kernel. To alleviate this, there are some extensible indexing frameworks. However, they all require re-engineering the index implementations, which is a problem when the index has third-party ownership, when only binary code is available, or simply when the index implementation is complex to re-engineer. Therefore, the DBMS should allow including new index implementations without code changes and performance degradation. Furthermore, for high performance the query processor needs knowledge of how to process queries to utilize plugged-in index. Moreover, it is important that all functionalities of a plugged-in index implementation are correct. The extensible main memory database system (MMDB) Mexima (Main-memory External Index Manager) addresses these challenges. It enables transparent plugging in main-memory index implementations without code changes. Index specific rewrite rules transform complex queries to utilize the indexes. Automatic test procedures validate the correctness of them based on user provided index meta-data. Moreover, the same optimization framework can also optimize complex queries sent to a back-end DBMS by exposing hidden indexes for its query optimizer. Altogether, Mexima is a complete and extensible platform for transparently index integration, utilization, and evaluation. Database indexing query processing index structures main-memory index validation
15	Distributed Graph Storage And Querying System Balaji, Janani 12 August 2016 (has links) Graph databases offer an efficient way to store and access inter-connected data. However, to query large graphs that no longer fit in memory, it becomes necessary to make multiple trips to the storage device to filter and gather data based on the query. But I/O accesses are expensive operations and immensely slow down query response time and prevent us from fully exploiting the graph specific benefits that graph databases offer. The storage models of most existing graph database systems view graphs as indivisible structures and hence do not allow a hierarchical layering of the graph. This adversely affects query performance for large graphs as there is no way to filter the graph on a higher level without actually accessing the entire information from the disk. Distributing the storage and processing is one way to extract better performance. But current distributed solutions to this problem are not entirely effective, again due to the indivisible representation of graphs adopted in the storage format. This causes unnecessary latency due to increased inter-processor communication. In this dissertation, we propose an optimized distributed graph storage system for scalable and faster querying of big graph data. We start with our unique physical storage model, in which the graph is decomposed into three different levels of abstraction, each with a different storage hierarchy. We use a hybrid storage model to store the most critical component and restrict the I/O trips to only when absolutely necessary. This lets us actively make use of multi-level filters while querying, without the need of comprehensive indexes. Our results show that our system outperforms established graph databases for several class of queries. We show that this separation also eases the difficulties in distributing graph data and go on propose a more efficient distributed model for querying general purpose graph data using the Spark framework. Graph Databases Distributed Graph Databases Distributed Graph Query Processing Spark
16	State-Slice: A New Stream Query Optimization Paradigm for Multi-query and Distributed Processing Wang, Song 25 March 2008 (has links) Modern stream applications necessitate the handling of large numbers of continuous queries specified over high volume data streams. This dissertation proposes novel solutions to continuous query optimization in three core areas of stream query processing, namely state-slice based multiple continuous query sharing, ring-based multi-way join query distribution and scalable distributed multi-query optimization. The first part of the dissertation proposes efficient optimization strategies that utilize the novel state-slicing concept to achieve maximum memory and computation sharing for stream join queries with window constraints. Extensive analytical and experimental evaluations demonstrate that our proposed strategies is capable to minimize the memory or CPU consumptions for multiple join queries. The second part of this dissertation proposes a novel scheme for the distributed execution of generic multi-way joins with window constraints. The proposed scheme partitions the states into disjoint slices in the time domain, and then distributes the fine-grained states in the cluster, forming a virtual computation ring. New challenges to support this distributed state-slicing processing are answered by numerous new techniques. The extensive experimental evaluations show that the proposed strategies achieve significant performance improvements in terms of response time and memory usages for a wide range of configurations and workloads on a real system. Ring based distributed stream query processing and multi-query sharing both are based on the state-slice concept. The third part of this dissertation combines the first two parts of this dissertation work and proposes a novel distributed multi-query optimization technique. database optimization stream query processing Querying (Computer science)
17	Robust Complex Event Pattern Detection over Streams Li, Ming 04 April 2010 (has links) Event stream processing (ESP) has become increasingly important in modern applications. In this dissertation, I focus on providing a robust ESP solution by meeting three major research challenges regarding the robustness of ESP systems: (1) while event constraint of the input stream is available, applying such semantic information in the event processing; (2) handling event streams with out-of-order data arrival and (3) handling event streams with interval-based temporal semantics. The following are the three corresponding research tasks completed by the dissertation: Task I - Constraint-Aware Complex Event Pattern Detection over Streams. In this task, a framework for constraint-aware pattern detection over event streams is designed, which on the fly checks the query satisfiability / unsatisfiability using a lightweight reasoning mechanism and adjusts the processing strategy dynamically by producing early feedback, releasing unnecessary system resources and terminating corresponding pattern monitor. Task II - Complex Event Pattern Detection over Streams with Out-of-Order Data Arrival. In this task, a mechanism to address the problem of processing event queries specified over streams that may contain out-of-order data is studied, which provides new physical implementation strategies for the core stream algebra operators such as sequence scan, pattern construction and negation filtering. Task III - Complex Event Pattern Detection over Streams with Interval-Based Temporal Semantics. In this task, an expressive language to represent the required temporal patterns among streaming interval events is introduced and the corresponding temporal operator ISEQ is designed. event stream constraint database CEP interval pattern detection query processing
18	Query Processing for Peer Mediator Databases Katchaounov, Timour January 2003 (has links) <p>The ability to physically interconnect many distributed, autonomous and heterogeneous software systems on a large scale presents new opportunities for sharing and reuse of existing, and for the creataion of new information and new computational services. However, finding and combining information in many such systems is a challenge even for the most advanced computer users. To address this challenge, mediator systems logically integrate many sources to hide their heterogeneity and distribution and give the users the illusion of a single coherent system.</p><p>Many new areas, such as scientific collaboration, require cooperation between many autonomous groups willing to share their knowledge. These areas require that the data integration process can be distributed among many autonomous parties, so that large integration solutions can be constructed from smaller ones. For this we propose a decentralized mediation architecture, peer mediator systems (PMS), based on the peer-to-peer (P2P) paradigm. In a PMS, reuse of human effort is achieved through logical composability of the mediators in terms of other mediators and sources by defining mediator views in terms of views in other mediators and sources.</p><p>Our thesis is that logical composability in a P2P mediation architecture is an important requirement and that composable mediators can be implemented efficiently through query processing techniques.</p><p>In order to compute answers of queries in a PMS, logical mediator compositions must be translated to query execution plans, where mediators and sources cooperate to compute query answers. The focus of this dissertation is on query processing methods to realize composability in a PMS architecture in an efficient way that scales over the number of mediators.</p><p>Our contributions consist of an investigation of the interfaces and capabilities for peer mediators, and the design, implementation and experimental study of several query processing techniques that realize composability in an efficient and scalable way.</p> Datalogi data integration mediators query processing Datalogi Computer science Datalogi
19	Query Processing for Peer Mediator Databases Katchaounov, Timour January 2003 (has links) The ability to physically interconnect many distributed, autonomous and heterogeneous software systems on a large scale presents new opportunities for sharing and reuse of existing, and for the creataion of new information and new computational services. However, finding and combining information in many such systems is a challenge even for the most advanced computer users. To address this challenge, mediator systems logically integrate many sources to hide their heterogeneity and distribution and give the users the illusion of a single coherent system. Many new areas, such as scientific collaboration, require cooperation between many autonomous groups willing to share their knowledge. These areas require that the data integration process can be distributed among many autonomous parties, so that large integration solutions can be constructed from smaller ones. For this we propose a decentralized mediation architecture, peer mediator systems (PMS), based on the peer-to-peer (P2P) paradigm. In a PMS, reuse of human effort is achieved through logical composability of the mediators in terms of other mediators and sources by defining mediator views in terms of views in other mediators and sources. Our thesis is that logical composability in a P2P mediation architecture is an important requirement and that composable mediators can be implemented efficiently through query processing techniques. In order to compute answers of queries in a PMS, logical mediator compositions must be translated to query execution plans, where mediators and sources cooperate to compute query answers. The focus of this dissertation is on query processing methods to realize composability in a PMS architecture in an efficient way that scales over the number of mediators. Our contributions consist of an investigation of the interfaces and capabilities for peer mediators, and the design, implementation and experimental study of several query processing techniques that realize composability in an efficient and scalable way. data integration mediators query processing Computer science Datalogi
20	KNN Query Processing in Wireless Sensor and Robot Networks Xie, Wei 28 February 2014 (has links) In Wireless Sensor and Robot Networks (WSRNs), static sensors report event information to one of the robots. In the k nearest neighbour query processing problem in WSRNs, the robot receives event report needs to find exact k nearest robots (KNN) to react to the event, among those connected to it. We are interested in localized solutions, which avoid message flooding to the whole network. Several existing methods restrict the search within a predetermined boundary. Some network density-based estimation algorithms were proposed but they either result in large message transmission or require the density information of the whole network in advance which is complex to implement and lacks robustness. Algorithms with tree structures lead to the excessive energy consumption and large latency caused by structural construction. Itinerary based approaches generate large latency or unsatisfactory accuracy. In this thesis, we propose a new method to estimate a search boundary, which is a circle centred at the query point. Two algorithms are presented to disseminate the message to robots of interest and aggregate their data (e.g. the distance to query point). Multiple Auction Aggregation (MAA) is an algorithm based on auction protocol, with multiple copies of query message being disseminated into the network to get the best bidding from each robot. Partial Depth First Search (PDFS) attempts to traverse all the robots of interest with a query message to gather the data by depth first search. This thesis also optimizes a traditional itinerary-based KNN query processing method called IKNN and compares this algorithm with our proposed MAA and PDFS algorithms. The experimental results followed indicate that the overall performance of MAA and PDFS outweighs IKNN in WSRNs. Wireless Sensor and Robot Networks Wireless Sensor Networks KNN Query Processing

Search results