Global ETD Search

1	Constructing Accurate Synopses for Database Query Optimization and Re-optimization Yu, Feng 01 May 2013 (has links) (PDF) Fast and accurate estimations for complex queries are profoundly beneficial for large databases with heavy workloads. The most widely adopted query optimizers use synopses to tune up the databases in manners of optimization and re-optimization. From Chapter 1 to Chapter 3, we focus on the synopses for query optimization. We propose a statistical summary for a database, called CS2 (Correlated Sample Synopsis), to provide rapid and accurate result size estimations for all queries with joins and arbitrary selections. Unlike the state-of-the-art techniques, CS2 does not completely rely on simple random samples, but mainly consists of correlated sample tuples that retain join relationships with less storage. We introduce a statistical technique, called reverse sample, and design an innovative estimator, called reverse estimator, to fully utilize correlated sample tuples for query estimation. We prove both theoretically and empirically that the reverse estimator is unbiased and accurate using CS2. Extensive experiments on multiple datasets show that CS2 is fast to construct and derives more accurate estimations than existing methods with the same space budget. In Chapter 4, we focus on the synopses for query re-optimization on repetitive queries. Repetitive queries refer to those queries that are likely to be executed repeatedly in the future, such as those that are used to generate periodic reports, perform routine maintenance, summarize data for analysis, etc. They can constitute a large part of daily activities of a database system and deserve more optimization efforts. In this paper, we propose to collect information about how tuples are joined in a query, called the query or join trace, during execution of a query. We intend to use this join trace to compute the selectivities of joins in all join orders for the query. We use existing operators, as well as new operators, to gather such information. We show that the trace gathered from a query is sufficient to compute the exact selectivities of all plans of the query. To reduce the overheads of generating a trace, we propose a sampling scheme that generates only a sample of the trace. Experimental results have shown that with only a small sample of the trace, accurate estimates of join selectivities can be obtained. The sample trace makes re-estimation of join selectivities of a repetitive query efficient and accurate. query estimation query optimization query processing sampling synopsis
2	PEDIGREE QUERY, VISUALIZATION, AND GENETIC CALCULATIONS TOOL Kurtcephe, Murat 27 August 2012 (has links) No description available. Computer Science pedigree pedigree query graphical query interface graph query
3	A database query language for operations on historical data Sadeghi, R. January 1987 (has links) No description available. 005 Historical Query Language
4	Data modelling, subtyping and functional programming Howells, William Gareth James January 1991 (has links) No description available. 005 Database query languages
5	GQuery - a natural language query system for geological databases Hassan, Hana Abbas January 1988 (has links) No description available. 005 Database query language
6	Multi-Mode Stream Processing For Hopping Window Queries Wei, Mingrui 06 May 2008 (has links) Window constraints are mechanisms to bound the tuples processed by continuous queries specified over unbounded data streams. While sliding window queries move the constraint window upon the arrival of each individual tuple, hopping window queries instead move the window by a fixed amount after some period, thus periodically refreshing their results. We observe that for large hops, techniques liked delta result updating may not be efficient -- as large portions of the tuples in the current window will be different from the previous window and thus must be maintained. On the other hand, the complete result updating technique, which has been found to be less suitable for sliding windows queries. Compute the next result based on the complete current window now can be shown to be superior in performance for some hopping windows queries. A trade-off emerges between the complete result method which has a lower per tuple processes cost but potentially processing redundant results versus the delta result method which has no redundant processing but pays a higher per tuple processing cost. On top of that, strict non-monotonic operators such as difference operator, cause premature expiration due to operator semantics. Negative tuples are needed for this kind of special expiration. Such negative tuples added extra burden to the stream engine. Thus, in streaming processing, the difference operator is typically suggested to be placed on top of the query plan despite its potential ability to reduce cardinality of the stream. With this thesis, we introduce a whole solution for hopping window query processing which includes an optimizer for generalized hopping window query optimization that exploits both processing techniques within one integrated query plan alone with query plan rewriting. First, we design the query operators to be multi-mode, that is, to be able to take either a delta or a complete result as input, and produce either a delta result or complete result as output. Then we design a cost model to be able to chose the optimal mode for each operator. Thirdly, our optimizer targets to configure each operator within a query plan to work in the suitable mode to achieve minimum overall processing costs. Last but not least, two query optimization techniques have been adopted. One explores all possibilities of pushing the difference down past joins using dynamic programming and assigning optimal mode at the same time, the other applies heuristic difference push down rule. The proposed techniques has been implemented within the WPI stream query engine, called CAPE. Finally, we show the benefit of our solution with a vast number of experimental results. stream process query optimization
7	Query Optimization for Database Federation Systems Wang, Di 04 May 2009 (has links) Database federation is one approach to data integration, in which a middleware, called mediator, provides uniform access to a number of heterogeneous data sources. In this thesis, we focus on the query optimization for distributed joins over database federation. One important observation in query optimization over distributed database system is that run-time conditions (namely available buffer size, CPU utilization in machine and network environment) can significantly affect the execution cost of a query plan. However, in existing database federation systems, very few studies have addressed run-time conditions. It is a challenging problem, because usually the mediator is not able to know the run-time conditions of remote sites and considering run-time conditions will bring about extra complexity to the optimizer. This thesis proposes the Cluster-and-Conquer algorithm for query optimization over database federation while efficiently considering run-time conditions. This algorithm has three-fold benefits. Firstly, the run-time conditions of machines are now available for cluster mediator. Secondly, each cluster mediator can deal with its own sub query concurrently, so the complexity of processing query plan is decreased. Thirdly, the algorithm outperforms other related approaches in terms of“cost of costing", because it removes unnecessary inter-cluster operations in the early stage. I have implemented a prototype data federation system with Cluster-and-Conquer algorithm. The experimental results showed the capabilities and efficiency of our algorithm and described the target scenarios where the algorithm performs better than other related approaches. database federation query optimization
8	Continuous monitoring of multi-dimensional queries / Mouratidis, Kyriakos. January 2006 (has links) Thesis (Ph.D.)--Hong Kong University of Science and Technology, 2006. / Includes bibliographical references (leaves 144-146). Also available in electronic version. Query languages (Computer science)
9	On automated query modification techniques for databases Du, Kaizheng January 1993 (has links) No description available. automated query modification databases
10	Mining Spatio-Temporal Reachable Regions over Massive Trajectory Data Ding, Yichen 15 April 2017 (has links) Mining spatio-temporal reachable regions aims to find a set of road segments from massive trajectory data, that are reachable from a user-specified location and within a given temporal period. Accurately extracting such spatio-temporal reachable area is vital in many urban applications, e.g., (i) location-based recommendation, (ii) location-based advertising, and (iii) business coverage analysis. The traditional approach of answering such queries essentially performs a distance-based range query over the given road network, which have two main drawbacks: (i) it only works with the physical travel distances, where the users usually care more about dynamic traveling time, and (ii) it gives the same result regardless of the querying time, where the reachable area could vary significantly with different traffic conditions. Motivated by these observations, in this thesis, we propose a data- driven approach to formulate the problem as mining actual reachable region based on real historical trajectory dataset. The main challenge in our approach is the system efficiency, as verifying the reachability over the massive trajectories involves huge amount of disk I/Os. In this thesis, we develop two indexing structures: 1) spatio-temporal index (ST-Index) and 2) connection index (Con-Index) to reduce redundant trajectory data access operations. We also propose a novel query processing algorithm with: 1) maximum bounding region search, which directly extracts a small searching region from the index structure and 2) trace back search, which refines the search results from the previous step to find the final query result. Moreover, our system can also efficiently answer the spatio-temporal reachability query with multiple query locations by skipping the overlapped area search. We evaluate our system extensively using a large-scale real taxi trajectory data in Shenzhen, China, where results demonstrate that the proposed algorithms can reduce 50%-90% running time over baseline algorithms. reachability query trajectory query spatio-temporal data management

Search results