Global ETD Search

11	Aproximaciones eficientes de consultas conjuntivas [Efficient approximations of conjunctive queries] Romero Orth, Miguel January 2012 (has links) When finding the exact answer to a query over a very large database is intractable, it is natural to approximate the query by a more efficient one that belongs to a class with good bounds on the complexity of query evaluation. In this thesis we study such approximations for conjunctive queries. These queries are of special interest in databases, and we know very well which classes of queries admit efficient evaluation, such as acyclic queries or those of bounded (hyper)treewidth. We define an approximation of a query Q as a query from one of those classes that disagrees with Q as little as possible. We focus on approximations that always return correct answers. We prove that, for the tractable classes of conjunctive queries mentioned above, approximations always exist and their sizes are at most polynomial in the size of the original query. This follows from general results that relate closure properties of classes of conjunctive queries to the existence of approximations. Furthermore, we prove that in many cases the size of the approximation is at most the size of the original query. We present a series of results on how certain combinatorial properties of queries affect their approximations, and we study bounds on the number of approximations as well as the complexity of finding and identifying approximations. Finally, we consider approximations that return all correct answers and study their properties. Data mining Homomorphisms (Mathematics) Conjunctive query Query approximation
12	Search Term Selection and Document Clustering for Query Suggestion Zhang, Xiaomin 06 1900 (has links) In order to improve a user's query and help the user quickly satisfy his/her information need, most search engines provide query suggestions that are meant to be relevant alternatives to the user's query. This thesis builds on the query suggestion system and evaluation methodology described in Shen Jiang's Master's thesis (2008). Jiang's system constructs query suggestions by searching for lexical aliases of web documents and then applying query search to the lexical aliases. A lexical alias for a web document is a list of terms that return the web document in a top-ranked position. Query search is a search process that finds useful combinations of search terms. The main focus of this thesis is to supply alternatives for the components of Jiang's system. We suggest three term scoring mechanisms and generalize Jiang's lexical alias search to be a general search for terms that are useful for constructing good query suggestions. We also replace Jiang's top-down query search by a bottom-up beam search method. We experimentally show that our query suggestion method improves Jiang's system by 30% for short queries and 90% for long queries using Jiang's evaluation method. In addition, we add new evidence supporting Jiang's conclusion that terms in the user's initial query are important to include in the query suggestions. We also explore the usefulness of document clustering in creating query suggestions. Our experimental results are the opposite of what we expected: query suggestion based on clustering does not perform nearly as well, in terms of the "coverage" scores we are using for evaluation, as our best method that is not based on document clustering.
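The bottom-up beam search mentioned in the abstract above can be illustrated with a minimal sketch. This is not Zhang's implementation: the abstract does not specify the scoring mechanisms, so the `score` callable, the `beam_width` and `max_terms` parameters, and the function name are all hypothetical. The sketch only shows the general technique of growing candidate queries one term at a time while keeping the best few at each size.

```python
from typing import Callable, FrozenSet, Iterable, List

Query = FrozenSet[str]


def beam_search_queries(
    terms: Iterable[str],
    score: Callable[[Query], float],  # hypothetical scoring function
    beam_width: int = 3,
    max_terms: int = 3,
) -> List[Query]:
    """Grow queries bottom-up, keeping the `beam_width` best at each size."""
    vocabulary = sorted(set(terms))

    def rank(queries: Iterable[Query]) -> List[Query]:
        # Highest score first; ties are broken alphabetically for determinism.
        return sorted(queries, key=lambda q: (-score(q), sorted(q)))

    # Level 1: single-term queries.
    beam = rank(frozenset([t]) for t in vocabulary)[:beam_width]
    seen = list(beam)

    # Levels 2..max_terms: extend each surviving query by one unused term.
    for _ in range(max_terms - 1):
        candidates = {q | {t} for q in beam for t in vocabulary if t not in q}
        if not candidates:
            break
        beam = rank(candidates)[:beam_width]
        seen.extend(beam)

    # Return the best queries seen at any size.
    return rank(set(seen))[:beam_width]
```

A caller would supply its own scoring function, for example one that sums per-term weights; the quality of the suggestions depends entirely on that choice.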
13	A product retrieval system robust to subjective queries Matsubara, Shigeki, Sugiki, Kenji January 2008 (has links) No description available. subjective query natural language query customer product reviews product retrieval system
14	Distributed XML Query Processing Kling, Patrick January 2012 (has links) While centralized query processing over collections of XML data stored at a single site is a well understood problem, centralized query evaluation techniques are inherently limited in their scalability when presented with large collections (or a single, large document) and heavy query workloads. In the context of relational query processing, similar scalability challenges have been overcome by partitioning data collections, distributing them across the sites of a distributed system, and then evaluating queries in a distributed fashion, usually in a way that ensures locality between (sub-)queries and their relevant data. This thesis presents a suite of query evaluation techniques for XML data that follow a similar approach to address the scalability problems encountered by XML query evaluation. Due to the significant differences in data and query models between relational and XML query processing, it is not possible to directly apply distributed query evaluation techniques designed for relational data to the XML scenario. Instead, new distributed query evaluation techniques need to be developed. Thus, in this thesis, an end-to-end solution to the scalability problems encountered by XML query processing is proposed. Based on a data partitioning model that supports both horizontal and vertical fragmentation steps (or any combination of the two), XML collections are fragmented and distributed across the sites of a distributed system. Then, a suite of distributed query evaluation strategies is proposed. These query evaluation techniques ensure locality between each fragment of the collection and the parts of the query corresponding to the data in this fragment. Special attention is paid to scalability and query performance, which is achieved by ensuring a high degree of parallelism during distributed query evaluation and by avoiding access to irrelevant portions of the data. 
For maximum flexibility, the suite of distributed query evaluation techniques proposed in this thesis provides several alternative approaches for evaluating a given query over a given distributed collection. Thus, to achieve the best performance, it is necessary to predict and compare the expected performance of each of these alternatives. In this work, this is accomplished through a query optimization technique based on a distribution-aware cost model. The same cost model is also used to fine-tune the way a collection is fragmented to the demands of the query workload evaluated over this collection. To evaluate the performance impact of the distributed query evaluation techniques proposed in this thesis, the techniques were implemented within a production-quality XML database system. Based on this implementation, a thorough experimental evaluation was performed. The results of this evaluation confirm that the distributed query evaluation techniques introduced here lead to significant improvements in query performance and scalability both when compared to centralized techniques and when compared to existing distributed query evaluation techniques. distributed query processing XML query processing Computer Science
15	Search Term Selection and Document Clustering for Query Suggestion Zhang, Xiaomin Unknown Date No description available.
17	Holistic Boolean Twig Pattern Matching for Efficient XML Query Processing Ding, Dabin 01 May 2014 (has links) Efficient twig pattern matching is essential to XML queries and other tree-based queries. Numerous so-called holistic algorithms have been proposed for efficiently processing the twig patterns in XML queries. However, a more general form of twig pattern, called Boolean-twig (or B-twig for short), which allows arbitrary combinations of any number of the three logical connectives AND, OR, and NOT in a twig pattern, has not been adequately addressed. The theme of this study is holistic (and efficient) B-twig pattern matching using region encoding and Dewey encoding schemes. We first adopt region encoding and propose a novel, direct approach called DBTwigMerge for holistic B-twig pattern matching, which, although it enjoys a certain theoretical "beauty" and "elegance", does not always outperform our prior approach, BTwigMerge. Based on the experience gained and in-depth investigation, we then come up with another new and more efficient approach, FBTwigMerge, which is proven to be the overall winner among all the holistic approaches using region encoding. We also study the holistic B-twig pattern matching problem using Dewey encoding. The unique properties of Dewey encoding bring both challenges and benefits to this problem. By carefully addressing the challenges, this dissertation finally presents the first Dewey-based holistic approach, called DeweyNOT, for efficiently solving the pattern matching problem with a subclass of B-twigs, i.e., twig queries involving arbitrary AND/NOT predicates. Extensive experimental studies have been conducted that demonstrate the viability and outstanding performance of the proposed approaches. B-twig holistic pattern matching twig query XML query processing
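The two labeling schemes named in the abstract above can be illustrated with a small sketch. It does not reproduce DBTwigMerge, FBTwigMerge, or DeweyNOT; it only shows the standard structural tests that such encodings make possible. The `Region` type and all function names are illustrative assumptions, with region labels taken as (start, end, level) positions from a document-order traversal and Dewey labels as tuples of child ordinals along the path from the root.

```python
from typing import NamedTuple, Tuple


class Region(NamedTuple):
    start: int  # position of the element's opening tag
    end: int    # position of the element's closing tag
    level: int  # depth in the tree (root = 1)


def region_is_ancestor(a: Region, d: Region) -> bool:
    # a contains d exactly when d's interval lies strictly inside a's.
    return a.start < d.start and d.end < a.end


def region_is_parent(a: Region, d: Region) -> bool:
    # A parent is an ancestor exactly one level above.
    return region_is_ancestor(a, d) and a.level + 1 == d.level


Dewey = Tuple[int, ...]


def dewey_is_ancestor(a: Dewey, d: Dewey) -> bool:
    # a is an ancestor of d exactly when a is a proper prefix of d.
    return len(a) < len(d) and d[: len(a)] == a


def dewey_is_parent(a: Dewey, d: Dewey) -> bool:
    # A parent's label is the child's label minus its last component.
    return len(a) + 1 == len(d) and d[: len(a)] == a
```

For the document `<a><b><c/></b><d/></a>`, one possible labeling is a = Region(1, 8, 1), b = Region(2, 5, 2), c = Region(3, 4, 3), d = Region(6, 7, 2), with Dewey labels (1,), (1, 1), (1, 1, 1), and (1, 2). Both schemes answer structural questions from the labels alone, without traversing the tree; a Dewey label additionally carries the labels of all of the node's ancestors.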
18	Resilient sensor network query processing Stokes, Alan Barry January 2014 (has links) Sensor networks comprise a collection of resource-constrained, low-cost, sometimes fragile wireless motes which have the capability to gather information about their surroundings through the use of sensors, and can be conceived as a distributed computing platform for applications ranging from event detection to environmental monitoring. A Sensor Network Query Processor (SNQP) is a means of collecting data from sensor networks where the requirements are defined using a declarative query language with a set of Quality of Service (QoS) expectations. As sensor networks are often deployed in hostile environments, there is a high possibility that the motes could break or that the communication links between the motes become unreliable. SNQP Query Execution Plans (QEPs) are often optimised for a specific network deployment and are designed to be as energy efficient as possible whilst ensuring the QEPs meet the QoS expectations, yet little has been done to handle the situation where the deployment itself has changed since the optimisation in such a way as to make the original QEP no longer efficient, or unable to operate. In this respect, the previous work on SNQPs has not aimed at being resilient to failures in the assumptions used at compilation/optimisation time which result in a QEP terminating earlier than expected. This dissertation presents a collection of approaches that embed resilience into SNQP-generated QEPs in such a way that a QEP operates for longer whilst still meeting the QoS expectations demanded of it, thereby resulting in a more reliable platform that can be applicable to a broader range of applications. 
The research contributions reported here include (a) a strategy designed to adapt to predictable node failures due to energy depletion; (b) a collection of strategies designed to adapt to unpredictable node failures; (c) a strategy designed to handle unreliable communication channels; and (d) an empirical evaluation to show the benefits of a resilient SNQP in relation to a representative non-resilient SNQP. 621.382
19	A Hybrid Cost Model for Evaluating Query Execution Plans Wang, Ning 22 January 2024 (has links) Query optimization aims to select a query execution plan among all query paths for a given query. The query optimization of traditional relational database management systems (RDBMSs) relies on estimating the cost of the alternative query plans in the query plan search space provided by a cost model. The classic cost model (CCM) may lead the optimizer to choose query plans with poor execution time due to inaccurate cardinality estimations and simplifying assumptions. A learned cost model (LCM) based on machine learning does not rely on such estimations and learns the cost from runtime. While learned cost models are shown to improve the average performance, they may not guarantee that optimal performance will be consistently achieved. In addition, the query plans generated using the LCM may not necessarily outperform the query plans generated with the CCM. This thesis proposes a hybrid approach to solve this problem by striking a balance between the LCM and the CCM. The hybrid model uses the LCM when it is expected to be reliable in selecting a good plan and falls back to the CCM otherwise. The evaluation results of the hybrid model demonstrate promising performance, indicating potential for successful use in future applications. Query optimization Hybrid cost model Learned cost model Query classifier
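The fallback rule described in the abstract above (use the learned cost model when it is expected to be reliable, otherwise use the classic one) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how reliability is decided, so `learned_is_reliable` is a hypothetical predicate, and `choose_plan`, `classic_cost`, and `learned_cost` are illustrative names and not the thesis's API.

```python
from typing import Callable, Sequence, TypeVar

Plan = TypeVar("Plan")


def choose_plan(
    plans: Sequence[Plan],
    classic_cost: Callable[[Plan], float],   # CCM estimate (assumed interface)
    learned_cost: Callable[[Plan], float],   # LCM estimate (assumed interface)
    learned_is_reliable: Callable[[Sequence[Plan]], bool],  # hypothetical check
) -> Plan:
    """Pick the cheapest plan under whichever cost model is trusted."""
    if not plans:
        raise ValueError("at least one candidate plan is required")
    # Use the learned model only when it is expected to be reliable;
    # otherwise fall back to the classic model.
    cost = learned_cost if learned_is_reliable(plans) else classic_cost
    return min(plans, key=cost)
```

The design choice worth noting is that both models rank the same candidate set, so the reliability predicate alone determines which ranking is used for a given query.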
20	BINDING HASH TECHNIQUE FOR XML QUERY OPTIMIZATION BRANT, MICHAEL J. 20 July 2006 (has links) No description available. XML Query Processing XML Query Optimization Semi-structured data XPath
