Global ETD Search

101	MTopS: Multi-Query Optimization for Continuous Top-K Query Workloads Shastri, Avani 05 May 2011 A continuous top-k query retrieves the k most preferred objects from a data stream according to a given preference function. These queries are important for a broad spectrum of applications, from web-based advertising and network traffic monitoring to financial analysis. Given the nature of such applications, a data stream may at any given time be subject to multiple top-k queries with varying parameter settings, requested simultaneously by different users. This workload of simultaneous top-k queries must be executed efficiently to ensure real-time responsiveness. However, existing methods in the literature focus on optimizing single top-k query processing and would therefore handle each query independently. They are not suitable for handling large numbers of simultaneous top-k queries because of their unsustainable resource demands. In this thesis, we present a comprehensive framework called MTopS, for Multiple Top-K Optimized Processing System. MTopS achieves resource sharing at the query level by analyzing the parameter settings of all queries in the workload, including window-specific parameters and top-k parameters. We further optimize the shared processing by identifying the minimal object set from the data stream that is both necessary and sufficient for top-k monitoring of all queries in the workload. Within this framework, we design the MTopBand algorithm, which maintains an up-to-date top-k result set of size O(k), where k is the size of the required top-k result, eliminating the need for any recomputation. To overcome the overhead MTopBand incurs in maintaining replicas of the top-k result set across sliding windows, we optimize this algorithm further by merging these views into one integrated structure, called MTopList. 
Our associated top-k maintenance algorithm, also called MTopList, maintains this linear integrated structure and can therefore efficiently answer all queries in the workload. MTopList is shown to be memory optimal because it maintains only the distinct objects that are part of the top-k results of at least one query. Our experimental study, using real data streams from the domains of stock trades and moving object monitoring, demonstrates that our proposed technique is superior to state-of-the-art solutions in both efficiency and scalability with respect to the query workload. meta query strategy MTopS predicted top-k results MTopList MTopBand multi top-k query
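To make the workload in entry 101 concrete, the following is a minimal Python sketch of several top-k queries sharing one sliding window. It is not MTopBand or MTopList: it simply re-ranks the window on every arrival, which is exactly the recomputation the thesis sets out to avoid. The function name `shared_topk`, the count-based window, and the assumption that all queries share the same window size are hypothetical choices made for illustration.

```python
from collections import deque
import heapq


def shared_topk(stream, window, ks, score=lambda obj: obj):
    """Yield, after each arrival, a dict {k: top-k list} for every k in ks.

    Assumes a count-based sliding window of `window` objects shared by all
    queries (an illustrative simplification, not the thesis's setting).
    """
    buf = deque(maxlen=window)   # oldest object is evicted automatically
    k_max = max(ks)
    for obj in stream:
        buf.append(obj)
        # One ranking pass serves every query: the answer for any
        # k <= k_max is a prefix of the top-k_max list.
        ranked = heapq.nlargest(k_max, buf, key=score)
        yield {k: ranked[:k] for k in ks}


if __name__ == "__main__":
    for answers in shared_topk([5, 1, 9, 3, 7], window=3, ks=[1, 2]):
        print(answers)
```

The only sharing shown here is that a query with a smaller k reads a prefix of the ranking computed for the largest k; queries with different window sizes would need additional machinery.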
102	Query Expansion : en jämförande studie av Automatisk Query Expansion med och utan relevans-feedback / Query Expansion : a comparative study of Automatic Query Expansion with and without relevance feedback Ekberg-Selander, Karin, Enberg, Johanna January 2007 In query expansion (QE), terms are added to an initial query in order to improve retrieval effectiveness. In this thesis we use QE in the sense that the query is reformulated by deleting the terms in the initial query and replacing them with terms from the documents retrieved in the initial run. The aim of this thesis is to study and compare, in an experimental full-text environment, the retrieval results of two different query expansion strategies. The following questions are addressed by the study: How do the two strategies perform in relation to each other regarding recall? What may be causing the result? Are the two strategies retrieving the same relevant documents? Two strategies are designed to simulate a searcher using automatic query expansion (AQE) either with or without relevance feedback. Strategy I simulates AQE without relevance feedback by taking the top five documents retrieved in the initial run and extracting the ten most frequently occurring terms in these to create a new query. Correspondingly, Strategy II simulates AQE with relevance feedback by taking the top five relevant documents and extracting the top ten terms in these to create a new query. It is concluded that the retrieval performance of both strategies improved for most of the topics. On average, Strategy II achieved 54.63 percent recall, compared with 45.59 percent for Strategy I. The two strategies retrieved different relevant documents for the majority of the topics. Hence, it would be reasonable to base a system on both of them. 
/ Thesis level: D query expansion query reformulation relevance feedback inquery retrieval effectiveness information retrieval Social Sciences
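The two strategies in entry 102 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' experimental setup: documents are plain strings, tokenization is a whitespace split, and the names `expand_query`, `strategy_two`, and `is_relevant` are hypothetical.

```python
from collections import Counter


def expand_query(ranked_docs, n_docs=5, n_terms=10, stopwords=frozenset()):
    """Strategy I: build a new query from the most frequent terms
    in the top n_docs documents of the initial run."""
    counts = Counter()
    for doc in ranked_docs[:n_docs]:
        # Naive tokenization (assumption): lowercase and split on whitespace.
        counts.update(t for t in doc.lower().split() if t not in stopwords)
    return [term for term, _ in counts.most_common(n_terms)]


def strategy_two(ranked_docs, is_relevant, **kwargs):
    """Strategy II: same extraction, but only over documents judged relevant."""
    relevant = [doc for doc in ranked_docs if is_relevant(doc)]
    return expand_query(relevant, **kwargs)


if __name__ == "__main__":
    docs = ["stream query stream", "query plan", "cooking recipes"]
    print(expand_query(docs, n_docs=2, n_terms=2))
    print(strategy_two(docs, lambda d: "query" in d, n_terms=1))
```

The new query replaces the initial one rather than extending it, which matches the sense of QE the abstract describes.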
103	Efficient and Reliable In-Network Query Processing in Wireless Sensor Networks Malhotra, Baljeet Singh 11 1900 Wireless Sensor Networks (WSNs) have emerged as a new paradigm for collecting and processing data from physical environments, such as wildlife sanctuaries, large warehouses, and battlefields. Users can access sensor data by issuing queries over the network, e.g., to find the 10 highest temperature values in the network. Typically, a WSN operates by constructing a logical topology, such as a spanning tree, on top of the physical topology of the network. The constructed logical topology is then used to disseminate queries in the network, and also to process the results of such queries and return them to the user. A major challenge in this context is prolonging the network's lifetime, which mainly depends on the energy cost of data communication via wireless radios; this cost is known to be very high compared to the cost of data processing within the network. In this research, we investigate some of the core problems that deal with the different aspects of in-network query processing in WSNs. In that context, we propose an efficient filtering-based algorithm for top-k query processing in WSNs. Through a systematic study of top-k query processing in WSNs, we propose several solutions in this thesis that are applicable not only to top-k queries but also to in-network query processing problems in general. Specifically, we consider broadcasting and convergecasting, two basic operations required by many in-network query processing solutions. Scheduling broadcasting and convergecasting is another problem that is important for energy efficiency in WSNs. Failure of communication links, which is common in WSNs, is yet another important issue that needs to be addressed. In this research, we take a holistic approach to these problems while processing top-k queries in WSNs. 
To this end, the thesis makes several contributions. In particular, our proposed solutions include new logical topologies, scheduling algorithms, and an overall communication framework that allows top-k queries to be processed efficiently and with increased reliability. Extensive simulation studies reveal that our solutions are not only energy efficient, saving up to 50% of the energy cost compared to the current state-of-the-art solutions, but also robust to link failures. Wireless Sensor Networks Data Query Processing Top-k Query Broadcast Convergecast Scheduling Link Failures Failure Recovery
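The example query in entry 103 (the 10 highest temperature values) can be answered over a spanning tree with plain in-network aggregation, sketched below in Python. This is a textbook-style baseline, not the filtering-based algorithm the thesis proposes; the function name `topk_at_node` and the dictionary layout of the tree and readings are hypothetical. The point it illustrates is that each node forwards at most k values to its parent, so radio traffic per link is bounded by k rather than by subtree size.

```python
import heapq


def topk_at_node(node, children, readings, k):
    """Return the k largest readings in the subtree rooted at `node`.

    children: dict mapping a node to the list of its children in the tree.
    readings: dict mapping a node to its local sensor reading.
    """
    candidates = [readings[node]]
    for child in children.get(node, []):
        # Each child reports at most k values (convergecast toward the root).
        candidates.extend(topk_at_node(child, children, readings, k))
    return heapq.nlargest(k, candidates)


if __name__ == "__main__":
    tree = {"root": ["a", "b"], "a": ["c"]}
    temps = {"root": 20, "a": 25, "b": 18, "c": 30}
    print(topk_at_node("root", tree, temps, k=2))
```

A subtree can safely discard everything outside its own k largest values, because a reading that is not among the top k of its subtree cannot be among the top k of the whole network.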
104	Energy-Efficient Data Management in Wireless Sensor Networks Ai, Chunyu 13 July 2010 Wireless Sensor Networks (WSNs) are deployed widely for various applications, and these deployments generate a variety of useful data. Since WSNs have limited resources and unreliable communication links, traditional data management techniques are not suitable. Therefore, designing effective data management techniques for WSNs becomes important. In this dissertation, we address three key issues of data management in WSNs. For data collection, a scheme that puts some nodes to sleep and estimates their values from the readings of the other, active nodes has proved energy-efficient. To improve the precision of estimation, we propose two estimation models, Data Estimation using a Physical Model (DEPM) and Data Estimation using a Statistical Model (DESM). Most existing data processing approaches for WSNs are real-time. However, historical data of WSNs are also significant for various applications, and no previous study has specifically addressed distributed historical data query processing. We propose an Index based Historical Data Query Processing scheme, which stores historical data locally and processes queries energy-efficiently by using a distributed index tree. Area query processing is likewise significant for various applications of WSNs, and no previous study has specifically addressed this issue. We propose an energy-efficient in-network area query processing scheme. In our scheme, we use an intelligent method (Grid lists) to describe an area, thus reducing the communication cost and dropping useless data as early as possible. A thorough simulation study shows that our schemes are effective and energy-efficient. 
Based on the area query processing algorithm, an Intelligent Monitoring System is designed to detect various events and provide real-time, accurate information for escape, rescue, and evacuation when a dangerous event happens. Data management Data estimation Historical data query Area query Wireless sensor networks Computer Sciences
105	Efficient Range and Join Query Processing in Massively Distributed Peer-to-Peer Networks Wang, Qiang January 2008 Peer-to-peer (P2P) has become a modern distributed computing architecture that supports massively large-scale data management and query processing. Complex query operators such as the range operator and the join operator are needed by various distributed applications, including content distribution, locality-aware services, computing resource sharing, and many others. This dissertation tackles a number of problems related to range and join query processing in P2P systems: fault-tolerant range query processing under a structured P2P architecture, distributed range caching under an unstructured P2P architecture, and integration of heterogeneous data under an unstructured P2P architecture. To support fault-tolerant range query processing and provide strong performance guarantees in the presence of network churn, effective replication schemes are developed at either the overlay network level or the query processing level. To facilitate range query processing, a prefetch-based caching approach is proposed to eliminate the performance bottlenecks incurred by data items that are not well cached in the network. Finally, a purely decentralized partition-based join query operator is devised to realize bandwidth-efficient join query processing under an unstructured P2P architecture. Theoretical analysis and experimental simulations demonstrate the effectiveness of the proposed approaches. peer to peer networks range query processing join query processing Computer Science
106	Equivalence of Queries with Nested Aggregation DeHaan, David January 2009 Query equivalence is a fundamental problem within database theory. The correctness of all forms of logical query rewriting (join minimization, view flattening, rewriting over materialized views, various semantic optimizations that exploit schema dependencies, federated query processing and other forms of data integration) requires proving that the final executed query is equivalent to the original user query. Hence, advances in the theory of query equivalence enable advances in query processing and optimization. In this thesis we address the problem of deciding query equivalence between conjunctive SQL queries containing aggregation operators that may be nested. Our focus is on understanding the interaction between nested aggregation operators and the other parts of the query body, and so we model aggregation functions simply as abstract collection constructors. Hence, the precise language that we study is a conjunctive algebraic language that constructs complex objects from databases of flat relations. Using an encoding of complex objects as flat relations, we reduce the query equivalence problem for this algebraic language to deciding equivalence between relational encodings output by traditional conjunctive queries (not containing aggregation). This encoding-equivalence cleanly unifies and generalizes previous results for deciding equivalence of conjunctive queries evaluated under various processing semantics. As part of our study of aggregation operators that can construct empty sub-collections (so-called “scalar” aggregation), we consider query equivalence for conjunctive queries extended with a left outer join operator, a very practical class of queries for which the general equivalence problem has never before been analyzed. 
Although we do not completely solve the equivalence problem for queries with outer joins or with scalar aggregation, we do propose useful sufficient conditions that generalize previously known results for restricted classes of queries. Overall, this thesis offers new insight into the fundamental principles governing the behaviour of nested aggregation. database query optimization query equivalence aggregation set semantics bag semantics Computer Science
109	An Adjustable Expanded Index for Predictive Queries of Moving Objects Chang, Fang-Ming 13 July 2007 With the development of wireless communications and mobile computing technologies, applications of moving objects have been developed in many areas, for example, traffic monitoring, mobile e-commerce, navigation systems, and geographic information systems. The defining feature of moving objects is that they change their locations continuously. Conventional spatial databases cannot store moving objects efficiently, because the databases must be updated frequently. Therefore, it is important to index moving objects so that queries about them can be answered efficiently. Among the spatial indexing methods for predicting current and future data, the approach of parametric spatial access methods has been widely adopted, since it needs little memory space to preserve parametric rectangles while still providing good performance. The methods of this approach include the TPR-tree, the TPR*-tree, the Bx-tree, and the Bxr-tree. Among those methods, the Bxr-tree improves the CPU performance of the TPR-tree by expanding the query region first, and improves the I/O performance of the Bx-tree by additionally expanding the data blocks. However, the query process of the Bxr-tree is so coarse that it spends too much CPU and I/O time checking useless data. Therefore, in this thesis, we propose a new data structure and a new query processing method, named Adjustable Expanded Index (AEI), to overcome the disadvantages of the Bxr-tree. In our method, each block records the maximum and minimum speeds in each of eight directions, instead of only the maximum speed in each of four directions as in the Bxr-tree method. Based on this data structure, the query region can be expanded in each of eight directions individually, instead of being expanded in each of four directions once as in the Bxr-tree method. 
Moreover, in our AEI method, the data blocks can be expanded according to their direction toward the query region, instead of being expanded in four directions as in the Bxr-tree method. In this way, the query process of AEI checks fewer data blocks because it considers the minimum speed in each of eight directions. Furthermore, AEI divides the objects into four groups according to their directions, which the Bxr-tree method does not do. Only the objects moving toward the query region are checked in the query process of AEI. Therefore, our method retrieves fewer data blocks and performs fewer I/O operations than the Bxr-tree. Our simulation shows that the query process of the AEI method is more efficient than that of the Bxr-tree in terms of the average numbers of retrieved data blocks and I/O operations. predictive query moving object spatio-temporal database the expansion of the query region the expansion of data block
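The idea of expanding the query region, as entry 109 attributes to the Bxr-tree (one maximum speed per each of four directions), can be sketched in Python as follows. This is a generic illustration under stated assumptions, not the thesis's AEI method with its eight directions and minimum speeds: object positions are assumed to be indexed at a reference time, the query asks about a time `dt` later, and the names `Rect`, `expand_region`, and the direction keys are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def expand_region(query, dt, max_speed):
    """Enlarge `query` so it covers every indexed position from which an
    object could reach the query rectangle within time `dt`.

    max_speed: dict with keys 'east', 'west', 'north', 'south' giving the
    largest speed observed among objects moving in that direction.
    """
    return Rect(
        # Eastbound objects may start west of the query and still enter it.
        x_min=query.x_min - max_speed["east"] * dt,
        # Northbound objects may start south of the query.
        y_min=query.y_min - max_speed["north"] * dt,
        # Westbound objects may start east of the query.
        x_max=query.x_max + max_speed["west"] * dt,
        # Southbound objects may start north of the query.
        y_max=query.y_max + max_speed["south"] * dt,
    )


if __name__ == "__main__":
    speeds = {"east": 1.0, "west": 0.5, "north": 2.0, "south": 0.0}
    print(expand_region(Rect(0, 0, 10, 10), dt=2.0, max_speed=speeds))
```

The expanded rectangle is a conservative superset: every candidate found in it still has to be checked against its actual velocity, which is the extra work the abstract says AEI reduces by also tracking minimum speeds.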
110	A Query Language and Its Processing for Time-Series Document Clusters Khy, Sophoin, Ishikawa, Yoshiharu, Kitagawa, Hiroyuki 12 1900 No description available. cluster graph cluster transition clustering result graphquery query language query processing transition pattern
