Global ETD Search

41	Database and query analysis tools for MySQL exploiting hypertree and hypergraph decompositions / Chokkalingam, Selvameenal. January 2006 (has links) Thesis (M.S.)--Ohio University, November, 2006. / Title from PDF t.p. Includes bibliographical references.
42	Towards Spatial Queries over Phenomena in Sensor Networks Jin, Guang January 2009 (has links) (PDF) No description available. Querying (Computer science) Sensor networks Wireless communication systems
43	Social-aware ridesharing Fu, Xiaoyi 04 December 2019 (has links) In the past few years, ridesharing has become increasingly popular in urban areas worldwide for its low cost and environmental friendliness. In a typical scenario, the ridesharing service provider matches drivers of private vehicles or taxis to those seeking local taxicab-like transportation. Much research attention has been drawn to the optimization of travel costs in shared rides. However, other important factors in ridesharing, such as social comfort, trust issues and revenue, have not been fully considered in existing work. Social-aware ridesharing, which makes use of social relations among drivers and riders to address safety issues, and dynamic pricing, which dynamically determines shared ride fares, are two active research directions with important business implications. In this dissertation, we take the first step to comprehensively investigate social-aware ridesharing queries. First, we study the problem of the top-k social-aware taxi ridesharing query. In particular, upon receiving a user's trip request, the service ranks feasible taxis in a way that integrates detour in time and passengers' cohesion in social distance. We propose a new system framework to support such a social-aware taxi-sharing service. It provides two methods for selecting candidate taxis for a given trip request. The grid-based method quickly goes through available taxis and returns a relatively larger candidate set, whereas the edge-based method takes more time to obtain a smaller candidate set. Furthermore, we design techniques to speed up taxi route scheduling for a given trip request. We propose travel-time based bounds to rule out unqualified cases quickly, as well as algorithms to find feasible cases efficiently. We evaluate our proposals using a real taxi dataset from New York City.
Experimental results demonstrate the efficiency and scalability of the proposed taxi recommendation solution in real-time social-aware ridesharing services. Second, we study the problem of efficient matching of offers and requests in social-aware ridesharing. We formulate a new problem, named Assignment of Requests to Offers (ARO), that aims to maximize the number of served riders while satisfying the social comfort constraints as well as spatial-temporal constraints. We prove that the ARO problem is NP-hard. We then propose an exact algorithm for a simplified ARO problem. We further propose three pruning strategies to efficiently narrow down the search space and speed up the assignment processing. Based on these pruning strategies, we develop two novel heuristic algorithms, the request-oriented approach and the offer-oriented approach, to tackle the ARO problem. We also study the dynamic ARO problem and present a novel algorithm to tackle this problem. Through extensive experiments, we demonstrate the efficiency and effectiveness of our proposed approaches on real-world datasets. Third, we study top-k vehicle matching in social ridesharing. In current ridesharing research, optimizing social cohesion and revenue at the same time has not been well studied. We present a new pricing scheme that better incentivizes drivers and riders to participate in ridesharing, and then propose a novel type of Price-aware Top-k Matching (PTkM) queries which retrieve the top-k vehicles for a rider's request by taking into account both social relations and revenue. We design an efficient algorithm with a set of powerful pruning techniques to tackle this problem. Moreover, we propose a novel index tailored to our problem to further speed up query processing. Extensive experimental results on real datasets show that our proposed algorithms achieve desirable performance for real-world deployment.
The work of this thesis shows that the social-aware ridesharing query processing techniques are effective and efficient, which would facilitate ridesharing services in the real world.
44	Socio-aware random walk search and replication in peer-to-peer networks Xie, Jing, 謝靜 January 2009 (has links) published_or_final_version / Electrical and Electronic Engineering / Master / Master of Philosophy Querying (Computer science) Database searching.
45	Uncertain data management. / CUHK electronic theses & dissertations collection January 2011 (has links) In this thesis, we explore the issues of uncertain data management in several different aspects. First, we propose a novel linear time algorithm to compute the positional probability, the computation of which is a primitive operator for most of the ranking definitions. Our algorithm is based on the conditional probability formulation of positional probability and the system of linear equations. Based on the formulation of conditional probability, we also prove a tight upper bound of the top-k probability of tuples, which is then used to stop the top-k computation earlier. Second, we study top-k probabilistic ranking queries with joins when scores and probabilities are stored in different relations. We focus on reducing the join cost in probabilistic top-k ranking. We investigate two probabilistic score functions, namely, expected rank value and probability of highest ranking. We give upper/lower bounds of such probabilistic score functions in random access and sequential access, and propose new I/O efficient algorithms to find top-k objects. Third, we extend the possible worlds semantics to probabilistic XML ranking query, which is to rank top-k probabilities of the answers of a twig query in probabilistic XML data. The new challenge is how to compute top-k probabilities of answers of a twig query in probabilistic XML in the presence of containment (ancestor/descendant) relationships. We focus on node queries first, and propose a new dynamic programming algorithm which can compute top-k probabilities for the answers of node queries based on the previously computed results in probabilistic XML data. We further propose optimization techniques to share the computational cost. We also show techniques to support path queries and tree queries. Fourth, we study how to rank documents using a set of keywords, given a context that is associated with the documents. 
We model the problem using a graph with two different kinds of nodes (document nodes and multi-attribute nodes), where the edges between document nodes and multi-attribute nodes exist with some probability. We discuss its score function, cost function, and ranking with uncertainty. We also propose new algorithms to rank documents that are most related to the user-given keywords by integrating the context information. / Uncertain data management has received a lot of attention recently because data obtained can be incomplete or uncertain in many real applications. Ranking of uncertain data has become an important research issue; ranking based on the possible worlds semantics makes it different from the ranking of deterministic data. With traditional deterministic data, we can compute a score for each object, and then the objects are ranked based on the computed scores. However, in the scenario of uncertain data, each object has a probability of being the true answer (or the existence probability), besides the computed score. A probabilistic top-k ranking query ranks objects by the interplay of score and probability based on the possible worlds semantics. Many definitions have been proposed in the literature based on the possible worlds semantics. / Chang, Lijun. / Advisers: Hong Cheng; Jeffrey Xu Yu. / Source: Dissertation Abstracts International, Volume: 73-06, Section: B, page: . / Thesis (Ph.D.)--Chinese University of Hong Kong, 2011. / Includes bibliographical references (leaves 131-139). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [201-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract also in Chinese. Database management Querying (Computer science) Ranking and selection (Statistics) Uncertainty (Information theory)
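As an illustration of the possible worlds semantics described in record 45, the sketch below computes, for each tuple, the probability that it appears in the top-k answer. It is a textbook dynamic program that runs in O(nk) time, not the linear-time algorithm of the thesis. It assumes tuples exist independently of one another and have distinct scores; the function name `topk_probabilities` and the `(score, probability)` input layout are hypothetical.

```python
def topk_probabilities(tuples, k):
    """Return (score, prob, topk_prob) for each tuple, in descending score order.

    Assumption: each tuple exists independently with the given probability,
    and scores are distinct. A tuple is in the top-k of a possible world iff
    it exists and fewer than k higher-scored tuples exist in that world.
    """
    ranked = sorted(tuples, key=lambda t: t[0], reverse=True)
    # dist[j] = probability that exactly j of the higher-scored tuples exist,
    # tracked only for j < k because larger counts never help.
    dist = [1.0] + [0.0] * (k - 1)
    result = []
    for score, p in ranked:
        result.append((score, p, p * sum(dist)))
        # Fold the current tuple into the count distribution.
        updated = [0.0] * k
        for j in range(k):
            updated[j] = dist[j] * (1.0 - p)
            if j > 0:
                updated[j] += dist[j - 1] * p
        dist = updated
    return result
```

Because `dist` only tracks counts below k, the running sum shrinks as more high-scored tuples are processed, which is what makes an upper bound on the top-k probability usable for early stopping.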
46	Incremental Maintenance Of Materialized XQuery Views El-Sayed, Maged F 23 August 2005 (has links) "Keeping views fresh by maintaining the consistency between materialized views and their base data in the presence of base updates is a critical problem for many applications, including data warehousing and data integration. While heavily studied for traditional databases, the maintenance of XML views remains largely unexplored. Maintaining XML views is complex due to the richness of the XML data model and the powerful capabilities of XML query languages, such as XQuery. This dissertation proposes a comprehensive solution for the general problem of maintaining materialized XQuery views. Our solution is the first to enable the maintenance of a large class of XQuery views including XPath expressions, FLWOR expressions, and Element Constructors. These views may contain arbitrary result construction and arbitrary grouping and join operations. Our solution also supports the unique order requirements of XQuery including source document order and query order. The contributions of this dissertation include: (i) an efficient solution for supporting order in XML query processing and view maintenance, (ii) an identifier-based technique for enabling incremental construction of XML views, (iii) a mechanism for modeling and validating source XML updates, (iv) a counting algorithm for supporting view maintenance on delete and modify updates, (v) an algebraic solution for propagating bulk XML updates, and (vi) an efficient mechanism for refreshing materialized XML views on propagated updates. We provide proofs of correctness of our proposed techniques for materialized XQuery maintenance. We have implemented a prototype of our view maintenance solution on top of the Rainbow XML query engine, developed at WPI. 
Our experiments confirm that our solution provides a practical and efficient solution for maintaining materialized XQuery views even when handling heterogeneous batches of possibly large source updates. Our solution follows the widely adopted propagate-apply framework for view maintenance common to all mainstream query engines. That is, our solution produces incremental maintenance plans in the same algebraic language used to define the views. These plans can thus be optimized and executed by standard query processing techniques. Being compatible with standard frameworks paves the way for our XML view maintenance solution to be easily adopted by existing database engines." XML XQuery Incremental View Maintenance XML (Document markup language) Querying (Computer science)
47	Updating XML Views Wang, Ling 24 August 2006 (has links) "Update operations over XML views are essential for applications using XML views. In this dissertation work, we provide scalable solutions to support updating through XML views defined over relational databases. In particular, we focus on the update-public semantic, where updates are always public (made to the public database), and the update-local semantic, where update effects are first kept local and then made public as and when required. Towards this, we propose the clean extended-source theory for determining whether a correct view update translation exists, which then serves as a theoretical foundation for us to design practical XML view updating algorithms. Under the update-public semantic, state-of-the-art view updating work focuses on identifying the correct update translation purely on the data. We instead take a schema-centric solution, which utilizes the schema of the underlying source to effectively prune updates that are guaranteed to be not translatable and pass updates that are guaranteed to be translatable directly to the SQL engine. Only those updates that could not be classified using schema knowledge are finally analyzed by examining the data. This required data-level check is further optimized under schema guidance to prune the search space for finding a correct translation. As the first work addressing the update-local semantic, we propose a practical framework, called LoGo. LoGo localizes the view update translation, while preserving the properties of views being side-effect free and updates being always updatable. LoGo also supports on-demand merging of the local database of the subject view into the public database (also called global database), while still guaranteeing the subject view being free of side effects.
A flexible synchronization service is provided in LoGo that enables all other views defined over the same public database to be refreshed, i.e., synchronized with the publicly committed changes, if so desired. Further, given that XML is an ordered data model, we propose an order-sensitive solution named O-HUX to support XML view updating with order. We have implemented the algorithms, along with respective optimization techniques. Experimental results confirm the effectiveness of the proposed services, and highlight their performance characteristics." View Updating XML XQuery XML (Document markup language) Querying (Computer science)
48	Metadata-Aware Query Processing over Data Streams Ding, Luping 22 April 2008 (has links) Many modern applications need to process queries over potentially infinite data streams to provide answers in real-time. This dissertation proposes novel techniques to optimize CPU and memory utilization in stream processing by exploiting metadata on streaming data or queries. It focuses on four topics: 1) exploiting stream metadata to optimize SPJ query operators via operator configuration, 2) exploiting stream metadata to optimize SPJ query plans via query-rewriting, 3) exploiting workload metadata to optimize parameterized queries via indexing, and 4) exploiting event constraints to optimize event stream processing via run-time early termination. The first part of this dissertation proposes algorithms for one of the most common and expensive query operators, namely join, that identify and purge no-longer-needed data from the state at runtime based on punctuations. The combined exploitation of punctuations and commonly used window constraints is also studied. Extensive experimental evaluations demonstrate both reductions in memory usage and improvements in execution time due to the proposed strategies. The second part proposes herald-driven runtime query plan optimization techniques. We identify four query optimization techniques and design a lightweight algorithm to efficiently detect the optimization opportunities at runtime upon receiving heralds. We propose a novel execution paradigm to support multiple concurrent logical plans by maintaining one physical plan. Extensive experimental study confirms that our techniques significantly reduce query execution times. The third part deals with the shared execution of parameterized queries instantiated from a query template. We design a lightweight index mechanism to provide multiple access paths to data to facilitate a wide range of parameterized queries.
To withstand workload fluctuations, we propose an index tuning framework to tune the index configurations in a timely manner. Extensive experimental evaluations demonstrate the effectiveness of the proposed strategies. The last part proposes event query optimization techniques by exploiting event constraints such as exclusiveness or ordering relationships among events extracted from workflows. Significant performance gains are shown to be achieved by our proposed constraint-aware event processing techniques. metadata constraint data stream continuous query optimization Querying (Computer science) Metadata Data processing
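The punctuation-based state purging mentioned in record 48 can be pictured with a small symmetric hash join. This is a minimal sketch under stated assumptions, not the dissertation's algorithm: a punctuation is modelled as a promise that one stream will send no further tuples with a given join key, and the class name `PunctuatedJoin` and its methods are hypothetical.

```python
from collections import defaultdict


class PunctuatedJoin:
    """Symmetric hash equi-join over streams "A" and "B" with punctuation-driven purging."""

    def __init__(self):
        # Per-stream join state: join key -> list of buffered values.
        self.state = {"A": defaultdict(list), "B": defaultdict(list)}
        # Keys each stream has promised never to send again.
        self.closed = {"A": set(), "B": set()}

    @staticmethod
    def _other(side):
        return "B" if side == "A" else "A"

    def on_tuple(self, side, key, value):
        """Probe the opposite state, then buffer the tuple only if it can still match later."""
        other = self._other(side)
        buffered = self.state[other].get(key, [])
        matches = [(value, v) if side == "A" else (v, value) for v in buffered]
        if key not in self.closed[other]:
            self.state[side][key].append(value)
        return matches

    def on_punctuation(self, side, key):
        """`side` will send no more tuples with `key`, so the opposite
        stream's buffered tuples for that key can never match again."""
        self.closed[side].add(key)
        self.state[self._other(side)].pop(key, None)

    def state_size(self):
        return sum(len(vs) for s in self.state.values() for vs in s.values())
```

A punctuation on stream A for some key frees the tuples buffered for stream B under that key and also stops new B tuples with that key from being buffered, which is where the memory saving comes from.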
49	Semantic Query Optimization for Processing XML Streams with Minimized Memory Footprint Li, Ming 25 August 2007 (has links) "XML streams have become increasingly prevalent in modern applications, ranging from network traffic monitoring to real-time information publishing. XQuery evaluation over XML streams requires the temporary buffering of XML elements, which not only utilizes system buffer and CPU resources but also causes unnecessary output latency. This thesis presents a semantic query optimization solution to minimize memory footprint during XQuery evaluation by exploiting XML schema knowledge. In many practical applications, XML streams are generated conforming to pre-defined schema constraints typically expressed via a DTD or an XML schema specification. Utilizing such constraints enables us to predict on the fly the non-occurrence of a given pattern within a bound context. This helps us to avoid data buffering and to release buffered data at an earlier moment, thus achieving a minimized memory footprint. In this work, we focus on one particular class of constraints, namely, the Pattern Non-Occurrence (PNO) constraint. We develop an automaton-based technique to detect PNO constraints at runtime. For a given query, optimization opportunities which can be triggered by runtime PNO detection are explored for memory footprint minimization. Optimization decisions are encoded using our proposed Condition-Action Graph (CAG). The optimization-embedded execution strategy is then proposed to execute an optimized plan by detecting PNO constraints at run-time and then triggering the corresponding encoded actions when certain predefined conditions are satisfied. To ensure the efficiency of such PNO-triggered optimization, we propose an optimization strategy for shrinking the CAGs by utilizing constraint knowledge during the query plan compiling phase. We implement our optimization technique within the Raindrop XQuery engine.
Our system implementation processes XQuery utilizing the Raindrop algebra. It is efficiently augmented by our optimization module, which uses the Glushkov automaton technique to capture and monitor PNO constraints in parallel with the query-driven pattern retrieval. Finally, we conduct experimental studies using both real and synthetic data streams to illustrate that our techniques bring significant performance improvement in both memory and CPU usage as well as improved output latency over state-of-the-art solutions, with little overhead." optimization XML query evaluation stream processing database XML (Document markup language) Querying (Computer science) Mathematical optimization
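50	Query and mining in large graph databases. January 2013 (has links) Graph structures can describe complex relationships among data objects and are therefore widely used in many fields. As the related application areas develop, graph databases have become very large and continue to grow. This brings researchers new challenges in graph query and graph mining. This thesis studies the following three problems: how to determine the correspondence between the vertices of two graphs so that substructures of one graph are matched to similar substructures of the other; how to find, in a database containing many small graphs, the graphs similar to a query graph; and how to select feature subgraphs and classify graphs in a database composed of graphs of different classes. / For the first problem, we propose a new two-stage graph matching algorithm. In the first stage, we adopt a new heuristic strategy that first selects anchor vertices and expands outward from them, quickly obtaining an initial matching. In the second stage, we design a new algorithm to improve the initial matching and prove that the new matching is better than the initial one. This two-stage algorithm obtains a high-quality matching of two graphs quickly and effectively. To solve the second problem, we first define a new measure of the distance between two graphs. It is based on the maximum common subgraph of the two graphs and captures well what the two graphs share and where they differ. Because computing the maximum common subgraph is NP-complete, to answer top-k similar graph queries quickly we propose an efficient algorithm that greatly reduces the number of maximum common subgraph computations. The algorithm prunes unqualified graphs using three lower bounds on the distance measure: the first two are computed from the structural information of the two graphs, and the third follows from the triangle inequality property of the distance measure. We also design three different index structures to support pruning, which strike different balances between pruning effectiveness and indexing time. For the third problem, we identify two main shortcomings of the widely used feature discriminative function and, based on them, propose a new diversified feature discriminative function. It measures not only the discriminative power of features but also their diversity. We analyze the properties of this function from several perspectives and find that it better separates graphs of different classes. Based on this function, we design a new feature selection algorithm that achieves high classification accuracy. / Graphs have a powerful ability to model complex structural relationships among data objects and have been widely used in various applications. Along with the development of the application domains, graph databases have become large and are growing rapidly in size. This brings researchers new challenges in graph query and mining, among which we mainly focus on investigating the following three problems: how to find the correspondence between the nodes of two large graphs so that some substructures in one graph are mapped to similar substructures in the other; how to retrieve similar graphs for a query graph from a graph database consisting of a large number of graphs; and how to extract subgraph features to build an automated classification model for a graph database containing graphs which belong to different classes. / In this thesis, for the first problem, we propose a novel two-step approach which can efficiently match two large graphs over thousands of nodes with high matching quality. In the first stage, we design an anchor-selection/expansion scheme to construct a good initial matching heuristically.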
In the second stage, we propose a new approach to refine the initial matching and give the optimality of our refinement algorithm. Our approach can produce an approximate matching result with high quality and efficiency. To address the second problem, we introduce a new graph distance measure based on the maximum common subgraph (MCS) of two graphs, which captures both the common and the differing structures of the two graphs. Since computing the MCS of two graphs is NP-complete, to answer the top-k graph similarity query efficiently, we propose a fast algorithm which can significantly reduce the number of MCS computations. This algorithm prunes the unqualified graphs based on three lower bounds, of which the first two are derived from the structures of the two graphs and the third is obtained from the triangle property of the distance measure. Three index schemes are designed with different tradeoffs between pruning power and construction cost to assist the query processing. For the third problem, we identify two main issues of the currently widely used discriminative score for feature selection, and introduce a new diversified discriminative score to explore the additional value of diversity together with discriminativity. We analyze the properties of the newly proposed diversified discriminative score from several perspectives and demonstrate that this score can make positive/negative graphs more separable. New algorithms are also proposed to select features based on the new score and they are shown to have high classification accuracy. / Detailed summary in vernacular field only. / Zhu, Yuanyuan. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2013. / Includes bibliographical references (leaves 137-146). / Abstract also in Chinese.
/ Abstract / Abstract in Chinese / Acknowledgments / Contents / List of Tables / List of Figures / Notations / Chapter 1. Introduction / Chapter 1.1. Motivation / Chapter 1.1.1. Large Graph Matching / Chapter 1.1.2. Top-k Graph Similarity Query / Chapter 1.1.3. Diversified Discriminative Feature Selection / Chapter 1.2. Contribution / Chapter 2. Preliminaries / Chapter 3. Related Work / Chapter 3.1. Graph Matching / Chapter 3.1.1. Exact Graph Matching / Chapter 3.1.2. Approximate Graph Matching / Chapter 3.2. Graph Similarity Query / Chapter 3.3. Graph Classification / Chapter 4. Large Graph Matching / Chapter 4.1. Problem Statement / Chapter 4.2. An Overview: Construction and Refinement / Chapter 4.3. Matching Construction / Chapter 4.3.1. Global and Local Node Similarity / Chapter 4.3.2. Anchor Selection and Expansion / Chapter 4.3.3. Discussion on τ for Anchor Selection / Chapter 4.4. Matching Refinement / Chapter 4.4.1. Vertex Cover Based Refinement / Chapter 4.4.2. Refinement and Its Optimality / Chapter 4.4.3. Randomly Refinement Excluding C - F₁ / Chapter 4.4.4. Randomly Refinement Including C - F₁ / Chapter 4.5. Labeled Graph Handling / Chapter 4.6. Experiments / Chapter 4.6.1. Comparison with the Approximate Algorithms / Chapter 4.6.2. Comparison with the Exact Algorithm / Chapter 4.6.3. Parameter and Scalability Testing / Chapter 4.6.4. Sensitivity of Randomness (PN) / Chapter 4.6.5. Effectiveness of Label Distribution / Chapter 4.7. Summary / Chapter 5. Top-k Graph Similarity Query / Chapter 5.1. Problem Statement / Chapter 5.2. The Framework / Chapter 5.3. Pruning without Indexing / Chapter 5.3.1. Edge Frequency Based Lower Bound / Chapter 5.3.2. Adjacency List Based Lower Bound / Chapter 5.3.3. Query Processing / Chapter 5.4. Pruning with Indexing / Chapter 5.4.1. The Triangle Property of Graph Distance / Chapter 5.4.2. Query Processing / Chapter 5.4.3. Indexing / Chapter 5.4.4. Discussion on the Generality of Our Framework / Chapter 5.5. Experiments / Chapter 5.5.1. Similarity Measures Evaluation / Chapter 5.5.2. Query Performance Evaluation / Chapter 5.5.3. Indexing Cost Evaluation / Chapter 5.6. Summary / Chapter 6. Diversified Discriminative Feature Selection / Chapter 6.1. Problem Statement / Chapter 6.2. Discriminative Score / Chapter 6.2.1. The Single Feature Discriminative Score / Chapter 6.2.2. A New Diversified Discriminative Score / Chapter 6.3. Property Statistics of Discriminative Score / Chapter 6.4. The Algorithms / Chapter 6.5. Ensemble D&D / Chapter 6.6. Experiments / Chapter 6.6.1. D&D Performance Analysis / Chapter 6.6.2. Comparison with Existing Algorithms / Chapter 6.6.3. Performance on Patterns Mined by GAIA / Chapter 6.7. Summary / Chapter 7. Conclusion and Future Work / Chapter 7.1. Conclusion / Chapter 7.2. Future work / Bibliography Databases Graph theory--Data processing Data mining Querying (Computer science) Data structures (Computer science)
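The triangle-property pruning described in record 50 can be sketched as follows. The thesis's MCS-based distance is not given in this record, so the sketch substitutes the Jaccard distance on edge sets, which is also a metric, as a cheap stand-in for the expensive distance. The single-pivot index and the names `jaccard_distance` and `topk_with_pivot_pruning` are assumptions made for illustration; the thesis's three index schemes are not reproduced here.

```python
import heapq


def jaccard_distance(a, b):
    """Jaccard distance between two edge sets; a metric, used here as a stand-in."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def topk_with_pivot_pruning(query, graphs, pivot, dist, k):
    """Return the k nearest graphs as (distance, index) pairs, plus the number
    of exact distance computations between the query and database graphs.

    For any metric, |d(q, p) - d(p, g)| <= d(q, g), so the precomputed
    pivot distances give a lower bound without touching g itself.
    """
    pivot_dist = [dist(pivot, g) for g in graphs]  # built offline in a real index
    d_qp = dist(query, pivot)
    bounds = [abs(d_qp - d) for d in pivot_dist]
    order = sorted(range(len(graphs)), key=lambda i: bounds[i])

    best = []  # max-heap via negated distances; holds the current top-k
    computed = 0
    for i in order:
        # Candidates are visited by increasing lower bound, so once one
        # cannot beat the current k-th distance, none of the rest can.
        if len(best) == k and bounds[i] >= -best[0][0]:
            break
        d = dist(query, graphs[i])
        computed += 1
        if len(best) < k:
            heapq.heappush(best, (-d, i))
        elif d < -best[0][0]:
            heapq.heapreplace(best, (-d, i))
    return sorted((-nd, i) for nd, i in best), computed
```

Visiting candidates in order of increasing lower bound lets the loop stop at the first candidate that cannot improve the answer, so the saving grows with how tightly the pivot separates the database graphs.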

Search results