61 |
Query optimization in XML based information integration for queries involving aggregation and group by
Alkaldi, Wejdan Abdullah. January 1900
Thesis (M.S.)--The University of North Carolina at Greensboro, 2009. / Directed by Fereidoon Sadri; submitted to the Dept. of Computer Science. Title from PDF t.p. (viewed May 25, 2010). Includes bibliographical references (p. 47).
|
62 |
Efficient group queries in location-based social networks
Li, Yafei 26 June 2015
Nowadays, with the rapid development of GPS-equipped mobile devices, location-based social networks have been emerging to bridge the gap between the physical world and online social networking services. Various types of data, such as personal locations, check-ins, microblogs and social relations, have become available in location-based social networks. Efficiently managing and analyzing such data to meet users' daily query requirements has become a challenging task. Among all the existing work on location-based social networks, group queries are one of the most important research topics. In this thesis, we investigate query techniques for location-based services in social networking applications. Specifically, considering a location-based social network, we study spatial-aware interest group queries, geo-social k-cover group queries, and social-aware ridesharing group queries. Firstly, we study the spatial-aware interest group queries in location-based social networks. Recently, most location-based social networks have released check-in services that allow users to share the locations they visit with their friends. These locations, considered as spatial objects, are usually associated with a few tags that describe their features. Utilizing such information, we propose a new type of Spatial-aware Interest Group (SIG) query that retrieves a user group of size k where each user is interested in the query keywords and the users are close to each other in Euclidean space. We prove this query problem is NP-complete, and develop two efficient algorithms, IOAIR and DOAIR, based on the IR-tree for the processing of SIG queries. We also validate the performance of the proposed query processing algorithms by empirical evaluation. Secondly, we study the problem of geo-social k-cover group queries for collaborative spatial computing.
In this problem, we propose a novel type of geo-social query, called the Geo-Social K-Cover Group (GSKCG) query, which is based on spatial containment and a new modeling of social relationships. Intuitively, given a set of spatial query points and an underlying social network, a GSKCG query finds a minimum user group in which the members satisfy a certain social relationship and their associated regions jointly cover all the query points. Despite its practical usefulness, the GSKCG query problem is NP-complete. We consequently explore a set of effective pruning strategies to derive an efficient algorithm for finding the optimal solution. Moreover, we design a novel index structure tailored to our problem to further accelerate query processing. Extensive experiments demonstrate that our algorithm achieves desirable performance on real-life datasets. Thirdly, we study the problem of social-aware ridesharing group queries. With the deep penetration of smartphones and geo-locating devices, ridesharing is envisioned as a promising solution to transportation-related problems such as congestion and air pollution in metropolitan cities. Despite the potential to provide significant societal and environmental benefits, ridesharing has so far not been as popular as expected. Notable barriers include social discomfort and safety concerns when traveling with strangers. To overcome these barriers, in this thesis, we propose a new type of Social-aware Ridesharing Group (SaRG) query which retrieves a group of riders by taking into account their social connections in addition to traditional spatial proximity. Because the SaRG query problem is NP-hard, we design an efficient algorithm with a set of powerful pruning techniques to tackle this problem. We also present several incremental strategies to accelerate the search by reducing repeated computations.
Moreover, we propose a novel index tailored to the proposed problem to further speed up query processing. Experimental results on real datasets show that our proposed algorithms achieve desirable performance. The work in this thesis shows that the group query processing techniques are effective, which would facilitate the wider deployment of such query services in real applications.
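To make the SIG query definition concrete, the following is a minimal brute-force sketch, not the IOAIR or DOAIR algorithms from the thesis. It assumes that "interested in the query keywords" means a user's tag set contains every query keyword, and that closeness is measured by the group's diameter (largest pairwise Euclidean distance); both are assumptions made for illustration, as are the function name and the input layout.

```python
from itertools import combinations
from math import dist


def sig_query_bruteforce(users, keywords, k):
    """Return (group, diameter) for the tightest group of k interested users.

    users: dict mapping a user name to ((x, y), set_of_tags).
    Assumption: a user is "interested" if their tags contain all keywords.
    Assumption: group closeness is the maximum pairwise Euclidean distance.
    """
    wanted = set(keywords)
    # Keep only users whose tags cover every query keyword.
    candidates = sorted(name for name, (_, tags) in users.items() if wanted <= tags)

    best_group, best_diameter = None, float("inf")
    # Exhaustive enumeration: exponential in k, which is why the thesis
    # resorts to index-based algorithms for this NP-complete problem.
    for group in combinations(candidates, k):
        diameter = max(
            (dist(users[a][0], users[b][0]) for a, b in combinations(group, 2)),
            default=0.0,
        )
        if diameter < best_diameter:
            best_group, best_diameter = group, diameter
    return best_group, best_diameter


example_users = {
    "ann": ((0, 0), {"coffee", "jazz"}),
    "bob": ((1, 0), {"coffee"}),
    "cat": ((0, 1), {"coffee", "art"}),
    "dan": ((10, 10), {"coffee"}),
    "eve": ((0.5, 0.5), {"tea"}),
}
group, diameter = sig_query_bruteforce(example_users, ["coffee"], 3)
```

The enumeration visits every size-k subset of the candidates, so it is only practical for very small inputs; it serves as a reference against which a pruned or indexed search could be checked.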
|
63 |
Optimizing Query Processing Under Skew
Zhang, Wangda January 2020
Big data systems such as relational databases, data science platforms, and scientific workflows all process queries over large and complex datasets. Skew is common in these real-world datasets and workloads. Different types of skew can have different impacts on the performance of query processing. Although skew sometimes causes load imbalance in a parallel execution environment, negatively impacting query performance, we demonstrate in this thesis that, in many cases, we can actually improve query performance in the presence of skew. To optimize query processing under skew, we develop a set of techniques to exploit the positive effects of skew and to avoid the negative effects. In order to exploit skew, we propose techniques including: (a) intentionally creating skew and clustering data in a distributed database system; (b) optimizing data layout for better caching in main-memory databases; and (c) adaptive execution techniques that are responsive to the underlying data in the context of compilers. In order to ameliorate skew, we study optimized hash-based partitioning that alleviates the effect of outliers in a genomic data context, as well as parallel prefix sum algorithms that are used to develop skew-insensitive algorithms. We evaluate the effectiveness of our techniques over synthetic data, standard benchmarks, and empirical datasets, and show that the performance of query processing under skew can be greatly improved. Overall, this thesis makes a concrete contribution to skew-related query processing.
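One common way a prefix sum yields skew-insensitive work division can be sketched as follows. This is a sequential illustration under stated assumptions, not the parallel algorithms developed in the thesis: each input item is assumed to produce a known number of output units, and the function names are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def exclusive_prefix_sum(counts):
    """offsets[i] = total output units produced by items before item i."""
    return list(accumulate(counts, initial=0))[:-1]


def balanced_ranges(counts, workers):
    """Split sum(counts) output units into `workers` near-equal chunks.

    Returns one (start, end, first_item) triple per worker, where
    [start, end) is the worker's slice of the global output and
    first_item is the input item that contains position `start`.
    A heavy item is simply shared by several consecutive workers.
    """
    offsets = exclusive_prefix_sum(counts)
    total = sum(counts)
    ranges = []
    for w in range(workers):
        start = total * w // workers
        end = total * (w + 1) // workers
        # Last item whose offset is <= start; skips zero-count items.
        first_item = bisect_right(offsets, start) - 1
        ranges.append((start, end, first_item))
    return ranges


skewed_counts = [1, 1, 10, 0, 4]  # item 2 is the heavy hitter
plan = balanced_ranges(skewed_counts, 4)
```

In this example every worker receives four output units even though one item accounts for ten of the sixteen, which is the sense in which the division does not depend on the skew of the input.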
|
64 |
A Data-Descriptive Feedback Framework for Data Stream Management Systems
Fernández Moctezuma, Rafael J. 01 January 2012
Data Stream Management Systems (DSMSs) provide support for continuous query evaluation over data streams. Data streams pose processing challenges due to their unbounded nature and varying characteristics, such as rate and density fluctuations. DSMSs need to adapt stream processing to these changes within certain constraints, such as available computational resources and minimum latency requirements in producing results. The proposed research develops an inter-operator feedback framework, where opportunities for run-time adaptation of stream processing are expressed in terms of descriptions of substreams and actions applicable to the substreams, called feedback punctuations. Both the discovery of adaptation opportunities and the exploitation of these opportunities are performed in the query operators. DSMSs are also concerned with state management, in particular, state derived from tuple processing. The proposed research also introduces the Contracts Framework, which provides execution guarantees about state purging in continuous query evaluation for systems with and without inter-operator feedback. This research provides both theoretical and design contributions. The research also includes an implementation and evaluation of the feedback techniques in the NiagaraST DSMS, and a reference implementation of the Contracts Framework.
|
65 |
Window Queries Over Data Streams
Li, Jin 01 October 2008
Evaluating queries over data streams has become an appealing way to support various stream-processing applications. Window queries are commonly used in many stream applications. In a window query, certain query operators, especially blocking operators and stateful operators, appear in their windowed versions. Previous research on evaluating window queries typically requires ordered streams, and this order requirement limits the implementations of window operators and also carries performance penalties. This thesis presents efficient and flexible algorithms for evaluating window queries. We first present a new data model for streams, progressing streams, that separates stream progress from physical-arrival order. Then, we present our window semantic definitions for the most commonly used window operators, window aggregation and window join. Unlike previous research that often requires ordered streams when describing window semantics, our window semantic definitions do not rely on physical-stream arrival properties. Based on the window semantic definitions, we present new implementations of window aggregation and window join, WID and OA-Join. Compared to the existing implementations of stream query operators, our implementations do not require special stream-arrival properties, particularly stream order. In addition, for window aggregation, we present two other implementations extended from WID, Paned-WID and AdaptWID, to improve execution time by sharing sub-aggregates and to improve memory usage for input with data distribution skew, respectively. Leveraging our order-insensitive implementations of window operators, we present a new architecture for stream systems, OOP (Out-of-Order Processing). Instead of relying on ordered streams to indicate stream progress, OOP explicitly communicates stream progress to query operators, and thus is more flexible than the previous in-order processing (IOP) approach, which requires maintaining stream order.
We implemented our order-insensitive window query operators and the OOP architecture in NiagaraST and Gigascope. Our performance study in both systems confirms the benefits of our window operator implementations and the OOP architecture compared to the commonly used approaches in terms of memory usage, execution time and latency.
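The idea of separating stream progress from arrival order can be illustrated with a small sketch. It is not the WID implementation from the thesis: it assumes integer timestamps, sliding windows where window w covers [w * slide, w * slide + range), and an explicit progress notification promising that no earlier timestamps will arrive. The class and method names are hypothetical.

```python
from collections import defaultdict


def window_ids(ts, range_, slide):
    """Ids of all windows [w*slide, w*slide + range_) that contain ts."""
    first = max(0, (ts - range_) // slide + 1)
    last = ts // slide
    return range(first, last + 1)


class OutOfOrderWindowSum:
    """Windowed sum that accepts tuples in any arrival order.

    Assumption: on_progress(ts) promises that no tuple with a timestamp
    below ts will arrive later; late tuples are not handled here.
    """

    def __init__(self, range_, slide):
        self.range_ = range_
        self.slide = slide
        self.partial = defaultdict(int)  # window id -> running sum

    def on_tuple(self, ts, value):
        # Group by window id only; arrival order never matters.
        for wid in window_ids(ts, self.range_, self.slide):
            self.partial[wid] += value

    def on_progress(self, ts):
        # Emit and purge every window that ends at or before ts.
        closed = sorted(
            w for w in self.partial if w * self.slide + self.range_ <= ts
        )
        return [(w, self.partial.pop(w)) for w in closed]


agg = OutOfOrderWindowSum(range_=10, slide=5)
for ts, value in [(7, 1), (3, 2), (12, 4), (6, 8)]:  # deliberately unordered
    agg.on_tuple(ts, value)
first_batch = agg.on_progress(10)
second_batch = agg.on_progress(20)
```

Because results are released only on progress notifications, the operator needs no sorting or buffering of raw tuples; its state is one partial aggregate per open window.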
|
66 |
Query Processing In Location-based Services
Liu, Fuyu 01 January 2010
With the advances in wireless communication technology and advanced positioning systems, a variety of Location-Based Services (LBS) have become available to the public. Mobile users can issue location-based queries to probe their surrounding environments. One important type of query in LBS is moving monitoring queries over mobile objects. Due to the high frequency of location updates and the high cost of continuous query processing, server computation capacity and wireless communication bandwidth are the two limiting factors for large-scale deployment of moving object database systems. To address both of these scalability factors, distributed computing has been considered. These schemes enable moving objects to participate as peers in query processing to substantially reduce the demand on server computation and the wireless communication associated with location updates. In the first part of this dissertation, we propose a distributed framework to process moving monitoring queries over moving objects in a spatial network environment. In the second part of this dissertation, in order to reduce the communication cost, we leverage both on-demand data access and periodic broadcast to design a new hybrid distributed solution for moving monitoring queries in an open space environment. Location-based services make our daily life more convenient. However, to receive the services, one has to reveal his/her location and query information when issuing location-based queries. This could lead to a privacy breach if this personal information is obtained by untrusted parties. In the third part of this dissertation, we introduce a new privacy protection measure called query l-diversity, and provide two cloaking algorithms to achieve both location k-anonymity and query l-diversity to better protect user privacy. In the fourth part of this dissertation, we design a hybrid three-tier architecture to help reduce privacy exposure.
In the fifth part of this dissertation, we propose to use the Road Network Embedding technique to process privacy-protected queries.
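The two privacy conditions named in this abstract can be illustrated with a short sketch. It assumes the usual readings of the terms: a cloaked region satisfies location k-anonymity if it contains at least k distinct users, and query l-diversity if the queries issued from it span at least l distinct query types. The naive expanding-box cloak below is a stand-in for illustration only and is not one of the dissertation's two cloaking algorithms; all names are hypothetical.

```python
def satisfies_privacy(region, users, k, l):
    """Check location k-anonymity and query l-diversity for a rectangle.

    region: (xmin, ymin, xmax, ymax)
    users:  iterable of (user_id, (x, y), query_type)
    """
    xmin, ymin, xmax, ymax = region
    inside = [
        (uid, query)
        for uid, (x, y), query in users
        if xmin <= x <= xmax and ymin <= y <= ymax
    ]
    distinct_users = {uid for uid, _ in inside}
    distinct_queries = {query for _, query in inside}
    return len(distinct_users) >= k and len(distinct_queries) >= l


def naive_cloak(center, users, k, l, step=1.0, max_steps=100):
    """Grow a square around `center` until both conditions hold.

    Illustrative only: returns None if no square within
    max_steps * step of the center satisfies the requirements.
    """
    cx, cy = center
    for i in range(1, max_steps + 1):
        r = i * step
        region = (cx - r, cy - r, cx + r, cy + r)
        if satisfies_privacy(region, users, k, l):
            return region
    return None


sample_users = [
    ("u1", (0, 0), "hospital"),
    ("u2", (1, 1), "hospital"),
    ("u3", (2, 0), "bar"),
    ("u4", (5, 5), "bank"),
]
cloaked = naive_cloak((0, 0), sample_users, k=2, l=2)
```

The example shows why k-anonymity alone can be insufficient: the smallest square holds two users who both ask about hospitals, so the region must grow until a second query type is included.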
|
67 |
Dynamic Optimization and Migration of Continuous Queries Over Data Streams
Zhu, Yali 23 August 2006
"Continuous queries process real-time streaming data and output results in streams for a wide range of applications. Due to fluctuating stream characteristics, a streaming database system needs to dynamically adapt query execution. This dissertation proposes novel solutions to continuous query adaptation in three core areas, namely dynamic query optimization, dynamic plan migration and partitioned query adaptation. Runtime query optimization needs to efficiently generate plans that satisfy both CPU and memory resource constraints. Existing work focuses on minimizing intermediate query results, which decreases memory and CPU usage simultaneously. However, doing so cannot ensure that both resource constraints are satisfied, because memory and CPU can be either positively or negatively correlated. This part of the dissertation proposes efficient optimization strategies that utilize both types of correlations to search the entire query plan space in polynomial time when a typical exhaustive search would take at least exponential time. Extensive experimental evaluations have demonstrated the effectiveness of the proposed strategies. Dynamic plan migration is concerned with on-the-fly transition from one continuous plan to a semantically equivalent yet more efficient plan. It is essential for guaranteeing the continuation and repeatability of dynamic query optimization. However, this research area has been largely neglected in the current literature. The second part of this dissertation proposes migration strategies that dynamically migrate continuous queries while guaranteeing the integrity of the query results, meaning there are no missing, duplicate or incorrect results. The extensive experimental evaluations show that the proposed strategies vary significantly in terms of output rates and memory usage given distinct system configurations and stream workloads.
Partitioned query processing is effective for processing continuous queries with large stateful operators in a distributed system. Dynamic load redistribution is necessary to balance uneven workloads across machines caused by changing stream properties. However, existing solutions generally assume static query plans without runtime query optimization. This part of the dissertation evaluates the benefits of applying query optimization in partitioned query processing and shows a dramatic performance improvement of more than 300%. Several load balancing strategies are then proposed to account for the heterogeneity of plan shapes across machines caused by dynamic query optimization. The effectiveness of the proposed strategies is analyzed through extensive experiments using a cluster."
|
68 |
Human-centered semantic retrieval in multimedia databases
Chen, Xin. January 2008
Thesis (Ph. D.)--University of Alabama at Birmingham, 2008. / Additional advisors: Barrett R. Bryant, Yuhua Song, Alan Sprague, Robert W. Thacker. Description based on contents viewed Oct. 8, 2008; title from PDF t.p. Includes bibliographical references (p. 172-183).
|
69 |
A personalised query expansion approach using context
Seher, Indra. January 2007
Thesis (Ph.D.)--University of Western Sydney, 2007. / A thesis submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy to the College of Health & Science, School of Computing and Mathematics, University of Western Sydney. Includes bibliography.
|
70 |
Query authentication in data outsourcing and integration services
Chen, Qian 27 August 2015
Owing to the explosive growth of data driven by e-commerce, social media, and mobile apps, data outsourcing and integration have become two popular Internet services. These services involve one or more data owners (DOs), many requesting clients, and a service provider (SP). The DOs outsource/synchronize their data to the SP, and the SP provides query services to the requesting clients on behalf of the DOs. However, as a third-party server, the SP might alter (leave out or forge) the outsourced/integrated data and query results, intentionally or not. To address this trustworthiness issue, the SP is expected to deliver its services in an authenticatable manner, so that the correctness of the service results can be verified by the clients. Unfortunately, existing work on query authentication cannot preserve the privacy of the data being queried. Furthermore, almost all previous studies assume only a single data source/owner, while data integration services usually combine data from multiple sources. In this dissertation, we take the first step to study the authentication of location-based queries with confidentiality and investigate authenticated online data integration services. Cost models, security analysis, and experimental results consistently show the effectiveness and robustness of our proposed schemes under various system settings and query workloads.
|