Global ETD Search

141	Robust Complex Event Pattern Detection over Streams Li, Ming 04 April 2010 Event stream processing (ESP) has become increasingly important in modern applications. In this dissertation, I focus on providing a robust ESP solution by meeting three major research challenges regarding the robustness of ESP systems: (1) applying the semantic information carried by event constraints during event processing, when such constraints on the input stream are available; (2) handling event streams with out-of-order data arrival; and (3) handling event streams with interval-based temporal semantics. The dissertation completes three corresponding research tasks. Task I - Constraint-Aware Complex Event Pattern Detection over Streams. This task designs a framework for constraint-aware pattern detection over event streams. The framework checks query satisfiability or unsatisfiability on the fly using a lightweight reasoning mechanism and adjusts the processing strategy dynamically by producing early feedback, releasing unnecessary system resources, and terminating the corresponding pattern monitor. Task II - Complex Event Pattern Detection over Streams with Out-of-Order Data Arrival. This task studies a mechanism for processing event queries specified over streams that may contain out-of-order data. The mechanism provides new physical implementation strategies for the core stream algebra operators, such as sequence scan, pattern construction, and negation filtering. Task III - Complex Event Pattern Detection over Streams with Interval-Based Temporal Semantics. This task introduces an expressive language for representing the required temporal patterns among streaming interval events and designs the corresponding temporal operator, ISEQ. event stream constraint database CEP interval pattern detection query processing
142	Query processing in a distributed environment Chao, Han Ying January 2010 Typescript (photocopy). / Digitized by Kansas Correctional Industries QUERY (Information retrieval system)
143	Querying big data with bounded data access Cao, Yang January 2016 Query answering over big data is cost-prohibitive. A linear scan of a dataset D may take days with a solid state device if D is of PB size and years if D is of EB size. In other words, polynomial-time (PTIME) algorithms for query evaluation are already not feasible on big data. To tackle this, we propose querying big data with bounded data access, such that the cost of query evaluation is independent of the scale of D. First of all, we propose a class of boundedly evaluable queries. A query Q is boundedly evaluable under a set A of access constraints if for any dataset D that satisfies constraints in A, there exists a subset DQ ⊆ D such that (a) Q(DQ) = Q(D), and (b) the time for identifying DQ from D, and hence the size |DQ| of DQ, are independent of |D|. That is, we can compute Q(D) by accessing a bounded amount of data no matter how big D grows. We study the problem of deciding whether a query is boundedly evaluable under A. It is known that the problem is undecidable for FO without access constraints. We show that, in the presence of access constraints, it is decidable in 2EXPSPACE for positive fragments of FO queries, but is already EXPSPACE-hard even for CQ. To handle the undecidability and high complexity of the analysis, we develop effective syntax for boundedly evaluable queries under A, referred to as queries covered by A, such that (a) any boundedly evaluable query under A is equivalent to a query covered by A, (b) each covered query is boundedly evaluable, and (c) it is efficient to decide whether Q is covered by A. On top of DBMS, we develop practical algorithms for checking whether queries are covered by A, and generating bounded plans if so. For queries that are not boundedly evaluable, we extend bounded evaluability to resource-bounded approximation and bounded query rewriting using views. 
(1) Resource-bounded approximation is parameterized with a resource ratio α ∈ (0,1], such that for any query Q and dataset D, it computes approximate answers with an accuracy bound η by accessing at most α|D| tuples. It is based on extended access constraints and a new accuracy measure. (2) Bounded query rewriting tackles the problem by incorporating bounded evaluability with views, such that the queries can be exactly answered by accessing cached views and a bounded amount of data in D. We study the problem of deciding whether a query has a bounded rewriting, establish its complexity bounds, and develop effective syntax for FO queries with a bounded rewriting. Finally, we extend bounded evaluability to graph pattern queries, by extending access constraints to graph data. We characterize bounded evaluability for subgraph and simulation patterns and develop practical algorithms for associated problems. 005.7
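The notion of bounded evaluation in the abstract above can be illustrated with a small sketch. It assumes an access constraint of the form X -> (Y, N): every X-value is associated with at most N distinct Y-values, and an index returns them directly. The class AccessConstraint, the two relation layouts, and the query friends_in_city are hypothetical names invented for this illustration; they are not taken from the thesis.

```python
from collections import defaultdict


class AccessConstraint:
    """Hypothetical index for a constraint X -> (Y, N).

    Each X-value maps to at most `bound` distinct Y-values, and
    `fetch` returns them without scanning the whole relation.
    """

    def __init__(self, rows, x, y, bound):
        self.bound = bound
        self.fetched = 0  # number of values returned by fetch so far
        self.index = defaultdict(set)
        for row in rows:
            self.index[row[x]].add(row[y])
        # Reject data that violates the declared cardinality bound.
        for key, values in self.index.items():
            if len(values) > bound:
                raise ValueError(f"{key!r} has more than {bound} values")

    def fetch(self, key):
        values = self.index.get(key, set())
        self.fetched += len(values)
        return values


def friends_in_city(person, city, friend_idx, person_idx):
    """Friends of `person` who live in `city`.

    The plan touches at most friend_idx.bound friend values plus
    person_idx.bound city values per friend, regardless of how many
    rows the underlying relations contain.
    """
    result = set()
    for friend in friend_idx.fetch(person):
        if city in person_idx.fetch(friend):
            result.add(friend)
    return result
```

With a friend bound of N1 and a city bound of N2, the plan reads at most N1 + N1 * N2 values, so its cost does not grow with the size of the dataset. That is the property the abstract calls bounded evaluability; deciding it for arbitrary queries is the hard problem the thesis studies, and this sketch does not address it.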
144	En komparativ studie av fem rankningsalgoritmer för query expansion / A comparative study of five ranking algorithms for query expansion Eklund, Johan, Stenström, Anders January 2002 The purpose of this thesis is to compare five different ranking algorithms for query expansion. The algorithms compared are f4, f4mod, porter, wpq, and emim. The comparison uses a TREC collection, a selection of topics, and relevance judgements. Relative recall is measured before and after the expansion of the query. The study shows that all of the algorithms increase relative recall, with f4 being the most successful. / Thesis level: D query expansion IR system retrieval effectiveness ranking algorithms test collections Social Sciences
145	Knowledge driven approaches to e-learning recommendation Mbipom, Blessing January 2018 Learners often have difficulty finding and retrieving relevant learning materials to support their learning goals because of two main challenges. First, the vocabulary learners use to describe their goals is different from that used by domain experts in teaching materials. This challenge causes a semantic gap. Second, learners lack sufficient knowledge about the domain they are trying to learn about, so they are unable to assemble effective keywords that identify what they wish to learn. This problem presents an intent gap. The work presented in this thesis focuses on addressing the semantic and intent gaps that learners face during an e-Learning recommendation task. The semantic gap is addressed by introducing a method that automatically creates background knowledge in the form of a set of rich learning-focused concepts related to the selected learning domain. The knowledge of teaching experts contained in e-Books is used as a guide to identify important domain concepts. The concepts represent important topics that learners should be interested in. An approach is developed which leverages the concept vocabulary for representing learning materials, and this influences retrieval during the recommendation of new learning materials. The effectiveness of our approach is evaluated on a dataset of Machine Learning and Data Mining papers, and our approach outperforms benchmark methods. The results confirm that incorporating background knowledge into the representation of learning materials provides a shared vocabulary for experts and learners, and this enables the recommendation of relevant materials. We address the intent gap by developing an approach which leverages the background knowledge to identify important learning concepts that are employed for refining learners' queries. 
This approach enables us to automatically identify concepts that are similar to queries, and take advantage of distinctive concept terms for refining learners' queries. Using the refined query allows the search to focus on documents that contain topics which are relevant to the learner. An e-Learning recommender system is developed to evaluate the success of our approach using a collection of learner queries and a dataset of Machine Learning and Data Mining learning materials. Users with different levels of expertise are employed for the evaluation. Results from experts, competent users and beginners all showed that using our method produced documents that were consistently more relevant to learners than when the standard method was used. The results show the benefits in using our knowledge driven approaches to help learners find relevant learning materials.
146	Query processing in Chiql: optimization and translation. January 1997 by Yip Suen-man. / Appendixes in Chinese and English. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1997. / Includes bibliographical references. / Acknowledgment / Abstract / Table of Contents / List of Tables / List of Figures / Chapter 1: Introduction / 1.1 Objectives / 1.2 Chapter Summary / Chapter 2: Related Work / 2.1 Relational Query Language / 2.1.1 Relational Algebra Vs Relational Calculus / 2.1.2 Procedural Vs Nonprocedural / 2.1.3 Natural Language (NL) Vs Restricted Natural Language (RNL) / 2.2 Existing Relational Query Language / 2.3 Chinese Related Work / 2.4 Chapter Summary / Chapter 3: Chinese Database Query Language: Chiql / 3.1 Naturalness / 3.2 Simplicity / 3.3 Procedural and Multi-statements Query Style / 3.4 Functional Completeness / 3.5 Chapter Summary / Chapter 4: Query Processing / 4.1 Query Optimization / 4.1.1 Query Representation / 4.1.2 Standardization / 4.1.3 Simplification / 4.1.4 Amelioration / 4.2 Query Translation of SQL / 4.3 Query Processing in Chiql / 4.3.1 Overview of the Query Processing / 4.3.2 Inter-Statement Dependency / 4.3.3 Translation flow of Chiql-to-SQL / 4.3.4 An Introductory Example / 4.4 Chapter Summary / Chapter 5: Statement Merging Algorithm (SMA) / 5.1 Problems / 5.2 Definitions / 5.3 Linear Merging Algorithm (LMA) / 5.4 Tree Merging Algorithm (TMA) / 5.5 Statement Merging Algorithm (SMA) / 5.6 Improvement / 5.7 Chapter Summary / Chapter 6: Pattern Mapping Algorithm (PMA) / 6.1 Problem / 6.2 Type of Patterns / 6.3 Pre-requisite of Pattern Mapping / 6.4 Pattern Mapping Algorithm (PMA) / 6.5 An Illustration Example / 6.6 Chapter Summary / Chapter 7: Evaluation / 7.1 Testing the Correctness / 7.2 Comparison in Translation Power With Other Translator / 7.3 Chapter Summary / Chapter 8: Conclusion / Reference / Appendix Chinese language--Data processing Query languages (Computer science) Database searching
147	Techniques d'optimisation pour des données semi-structurées du web sémantique / Database techniques for semantics-rich semi-structured Web data Leblay, Julien 27 September 2013 Since the beginning of the Semantic Web, RDF and SPARQL have become the standard data model and query language to describe resources on the Web. Large amounts of RDF data are now available either as stand-alone datasets or as metadata over semi-structured documents, typically XML. The growing coexistence and interdependence of RDF and XML make it increasingly pressing to represent and query data and metadata together. While significant efforts have been invested into producing and publishing annotations manually or automatically, little attention has been devoted to exploiting such data. This thesis aims at setting database foundations for the management of hybrid XML-RDF data. We present XR, a data model capturing the structural aspects of XML data and the semantics of RDF. The model is general enough to describe pure XML or RDF datasets, as well as RDF-annotated XML data, where any XML node can act as a resource. We also introduce the XRQ query language, which combines features of both XQuery and SPARQL. XRQ not only allows querying the structure of documents and the semantics of their annotations, but also producing annotated semi-structured data on the fly. We introduce the problem of query composition in XRQ, and exhaustively study query evaluation techniques for XR data to demonstrate the feasibility of this data management setting. We have developed an XR platform, XRP, on top of well-known data management systems for XML and RDF. The platform features several query processing algorithms, whose performance is compared experimentally. We present an application built on top of the XR platform. The application provides manual and automatic annotation tools, and an interface to query annotated Web pages and publicly available XML and RDF datasets concurrently. As a generalization of RDF and SPARQL, XR and XRQ enable RDFS-style query answering. In this respect, we present a technique to support RDFS entailment in RDF (and by extension XR) data management systems. Semantic Web XML RDF Linked Data Data models Query languages Query composition Query answering Query optimization
148	Load-balanced Range Query Workload Partitioning for Compressed Spatial Hierarchical Bitmap (cSHB) Indexes January 2018 Spatial databases store geometric objects such as points, lines, and polygons. Querying such complex spatial objects is a challenging task. Index structures improve lookup performance for the objects stored in a database, but traditional index structures do not perform well on spatial data. A significant amount of research has addressed ingesting, indexing, and querying spatial objects for different types of spatial queries, such as range, nearest-neighbor, and join queries. The Compressed Spatial Hierarchical Bitmap (cSHB) index is one such indexing and querying approach; it supports spatial range query workloads (sets of queries). cSHB indexes and many other approaches lack parallel computation. Massive amounts of spatial data require a lot of computation, and traditional methods are insufficient to address this. Other existing parallel processing approaches lack load balancing of parallel tasks, which leads to resource-overloading bottlenecks. In this thesis, I propose novel spatial partitioning techniques, Max Containment Clustering and Max Containment Clustering with Separation, to create load-balanced partitions of a range query workload. Each partition takes a similar amount of time to process its spatial queries, and response latency is reduced by minimizing the disk access cost and optimizing the bitmap operations. The partitions are processed in parallel using cSHB indexes. The proposed techniques utilize the block-based organization of bitmaps in the cSHB index and improve the performance of the cSHB index for processing a range query workload. / Dissertation/Thesis / Masters Thesis Computer Science 2018 Computer science partitioning range query spatial index spatial queries
149	Efficient Query Expansion Billerbeck, Bodo, bodob@cs.rmit.edu.au January 2006 Hundreds of millions of users each day search the web and other repositories to meet their information needs. However, queries can fail to find documents due to a mismatch in terminology. Query expansion seeks to address this problem by automatically adding terms from highly ranked documents to the query. While query expansion has been shown to be effective at improving query performance, the gain in effectiveness comes at a cost: expansion is slow and resource-intensive. Current techniques for query expansion use fixed values for key parameters, determined by tuning on test collections. We show that these parameters may not be generally applicable, and, more significantly, that the assumption that the same parameter settings can be used for all queries is invalid. Using detailed experiments, we demonstrate that new methods for choosing parameters must be found. In conventional approaches to query expansion, the additional terms are selected from highly ranked documents returned from an initial retrieval run. We demonstrate a new method of obtaining expansion terms, based on past user queries that are associated with documents in the collection. The most effective query expansion methods rely on costly retrieval and processing of feedback documents. We explore alternative methods for reducing query-evaluation costs, and propose a new method based on keeping a brief summary of each document in memory. This method allows query expansion to proceed three times faster than previously, while approximating the effectiveness of standard expansion. We investigate the use of document expansion, in which documents are augmented with related terms extracted from the corpus during indexing, as an alternative to query expansion. The overheads at query time are small. 
We propose and explore a range of corpus-based document expansion techniques and compare them to corpus-based query expansion on TREC data. These experiments show that document expansion delivers at best limited benefits, while query expansion, including standard techniques and efficient approaches described in recent work, usually delivers good gains. We conclude that document expansion is unpromising, but it is likely that the efficiency of query expansion can be further improved. information retrieval query expansion pseudo relevance feedback efficiency
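The conventional expansion loop described in the abstract above (run an initial retrieval, then add terms from the highly ranked documents to the query) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's method: ranking by summed term frequency and selecting expansion terms by raw frequency are simple stand-ins for a real ranking function and term-selection measure, and the names tokenize, rank, and expand_query are hypothetical.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real systems do more)."""
    return text.lower().split()


def rank(query_terms, docs):
    """Order document ids by summed query-term frequency, ties broken by id."""
    scores = {}
    for doc_id, text in docs.items():
        counts = Counter(tokenize(text))
        scores[doc_id] = sum(counts[term] for term in query_terms)
    return sorted(docs, key=lambda doc_id: (-scores[doc_id], doc_id))


def expand_query(query, docs, top_docs=2, new_terms=2):
    """Append the most frequent non-query terms from the top-ranked documents."""
    terms = tokenize(query)
    feedback = rank(terms, docs)[:top_docs]  # initial retrieval run
    pool = Counter()
    for doc_id in feedback:
        pool.update(t for t in tokenize(docs[doc_id]) if t not in terms)
    candidates = sorted(pool, key=lambda t: (-pool[t], t))
    return terms + candidates[:new_terms]
```

The fixed defaults top_docs and new_terms play the role of the fixed key parameters that the abstract argues should not be shared across all queries. The second retrieval pass over the expanded query, and the cost of fetching the feedback documents, are where the efficiency concerns raised in the abstract arise.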
