Global ETD Search

1	A Distributed Range Query Framework for Internet of Things Zhang, Congcong January 2014 With the rapid development of information technology, Internet of Things applications are booming. Applications that gather information from sensors and act on their environment through actuators can provide customized, intelligent behaviour to users. These applications are now widely used in daily life and have created a demand for multi-dimensional range queries over the Internet of Things. Because the data is fully distributed and devices such as sensors and mobile phones have limited resources and finite energy, supporting efficient range queries is a tough challenge. In this paper, we propose a distributed range query framework for the Internet of Things. To save energy and reduce network traffic, we suggest a data range reporting mechanism for the sensing peers: a peer reports a data range and reports again only when it senses an abnormal value, instead of moving every data item as in the common approach. In addition, we select some strong peers to act as super peers, which build a data index from the reported data ranges; this index is used to perform range queries. The study shows that the proposed framework reduces resource costs on weaker peers such as sensors and mobile phones, reduces network traffic among all peers in the network, and supports range queries. According to the evaluation results, the data range reporting method greatly reduces the number of data migrations and saves energy, and the data index significantly reduces access to unnecessary peers and diminishes network traffic.
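The range reporting idea in the abstract above can be illustrated with a minimal sketch. It is not the thesis's implementation: the class names, the fixed `half_width` tolerance, and the reading of "abnormal" as "outside the last reported range" are all assumptions made for illustration.

```python
from dataclasses import dataclass


class SuperPeerIndex:
    """Index held by a super peer: peer id -> last reported (low, high) range."""

    def __init__(self):
        self.ranges = {}

    def update(self, peer_id, low, high):
        self.ranges[peer_id] = (low, high)

    def candidates(self, q_low, q_high):
        # Only peers whose reported range overlaps the query need to be contacted.
        return sorted(p for p, (lo, hi) in self.ranges.items()
                      if lo <= q_high and q_low <= hi)


@dataclass
class SensingPeer:
    """Resource-limited peer that reports a value range instead of every reading."""
    peer_id: str
    half_width: float             # hypothetical tolerance around a reported reading
    low: float = float("inf")     # empty range until the first report
    high: float = float("-inf")
    reports_sent: int = 0

    def sense(self, value, index):
        # Report again only when the reading leaves the last reported range.
        if not (self.low <= value <= self.high):
            self.low = value - self.half_width
            self.high = value + self.half_width
            self.reports_sent += 1
            index.update(self.peer_id, self.low, self.high)


index = SuperPeerIndex()
peer_a = SensingPeer("a", half_width=2.0)
peer_b = SensingPeer("b", half_width=2.0)
for reading in [20.0, 20.5, 21.0, 19.2, 25.0]:
    peer_a.sense(reading, index)
for reading in [30.0, 30.4, 29.1]:
    peer_b.sense(reading, index)

print(peer_a.reports_sent, peer_b.reports_sent)
print(index.candidates(24.0, 26.0))
```

In this run, eight readings produce three reports in total, and a query for the range 24 to 26 is routed to peer "a" only.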
2	Load-balanced Range Query Workload Partitioning for Compressed Spatial Hierarchical Bitmap (cSHB) Indexes January 2018 abstract: Spatial databases are used to store geometric objects such as points, lines, and polygons. Querying such complex spatial objects is a challenging task. Index structures are used to improve the lookup performance of objects stored in databases, but traditional index structures do not perform well for spatial databases. A significant amount of research has been done on ingesting, indexing, and querying spatial objects for different types of spatial queries, such as range, nearest neighbor, and join queries. The Compressed Spatial Hierarchical Bitmap (cSHB) index structure is one such indexing and querying approach; it supports spatial range query workloads (sets of queries). cSHB indexes and many other approaches lack parallel computation. Massive amounts of spatial data require a lot of computation, and traditional methods are insufficient to address this. Other existing parallel processing approaches lack load balancing of parallel tasks, which leads to resource overloading bottlenecks. In this thesis, I propose novel spatial partitioning techniques, Max Containment Clustering and Max Containment Clustering with Separation, to create load-balanced partitions of a range query workload. Each partition takes a similar amount of time to process its spatial queries, and response latency is reduced by minimizing the disk access cost and optimizing the bitmap operations. The partitions created are processed in parallel using cSHB indexes. The proposed techniques utilize the block-based organization of bitmaps in the cSHB index and improve the performance of the cSHB index for processing a range query workload. / Dissertation/Thesis / Masters Thesis Computer Science 2018 Computer science partitioning range query spatial index spatial queries
3	NAAK-Tree: An Index for Querying Spatial Approximate Keywords Liou, Yen-Guo 11 July 2012 In recent years, geographic information system (GIS) databases have developed quickly and play a significant role in many applications. Many of these applications allow users to find objects using keywords and spatial information at the same time. Most research on spatial keyword queries considers only exact matches between the textual information in the database and in the query. Since users may not know how to spell the exact keyword, they may issue a query with an approximate keyword instead. Therefore, how to process approximate-keyword queries in a spatial database has become an important research topic. Alsubaiee et al. proposed the Location-Based-Approximate-Keyword-tree (LBAK-tree) index structure, which augments a tree-based spatial index with approximate-string indexes such as a gram-based index. However, the LBAK-tree is an R-tree based index structure. The nodes of the R-tree have to be split and reinserted when they become full. Because of this, it cannot index the spatial attribute and the textual attribute at the same time; it stores the keywords in the nodes after the R-tree has already been built. Being based on the R-tree, it has to search all the children of a node to insert a new item or answer a query. Moreover, after the needed keywords are found using the approximate index, the nodes are probed by checking the intersection of the similar-keyword sets with the keywords stored in the nodes. However, the higher the level of a node, the larger the number of keywords stored in it, so checking the intersections takes a long time. In addition, the LBAK-tree checks all the intersections even if one of them is already an empty set.
Therefore, in this thesis, we propose the Nine-Area-Approximate-Keyword-tree (NAAK-tree) index structure to process spatial approximate-keyword queries. We do not have to partition the space to construct the spatial index. We do not have to reinsert the children when splitting nodes, so we can handle the keywords at the same time. We can use the spatial number to find the nodes that satisfy the spatial condition of the query. We also augment the NAAK-tree with signatures to speed up evaluation of the textual condition: the union of the bit strings of the keywords in a node represents them in that node. Therefore, we can efficiently filter out nodes that contain no keyword corresponding to the query by checking the signatures just once, without checking all the keywords stored in the nodes. With the NAAK-tree, if one of the similar-keyword sets is empty, we do not check all the similar-keyword sets. Our simulation results show that the NAAK-tree is more efficient than the LBAK-tree at building the index and answering spatial approximate-keyword queries. Signature Index Structure Approximate-Keyword Spatial Database Range Query
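The signature filter described above (a node is represented by the union of its keywords' bit strings) can be sketched as follows. This is a generic superimposed-coding illustration, not the NAAK-tree's actual encoding: the signature length `SIG_BITS`, the number of bits per keyword `BITS_PER_WORD`, and the SHA-256 based hashing are assumptions. The sketch covers only the check of one candidate keyword against a node signature; finding the similar keywords and the spatial part of the index are outside its scope.

```python
import hashlib

SIG_BITS = 64       # hypothetical signature length in bits
BITS_PER_WORD = 3   # hypothetical number of bit positions chosen per keyword


def keyword_signature(word):
    """Map a keyword to a bit string with at most BITS_PER_WORD bits set."""
    sig = 0
    for i in range(BITS_PER_WORD):
        digest = hashlib.sha256(f"{i}:{word}".encode()).digest()
        sig |= 1 << (int.from_bytes(digest[:4], "big") % SIG_BITS)
    return sig


def node_signature(keywords):
    """Union (bitwise OR) of the signatures of all keywords stored in a node."""
    sig = 0
    for word in keywords:
        sig |= keyword_signature(word)
    return sig


def may_contain(node_sig, word):
    """False: the keyword is definitely absent. True: the node must be inspected."""
    query_sig = keyword_signature(word)
    return (node_sig & query_sig) == query_sig


node = node_signature(["cafe", "hotel", "museum"])
print(may_contain(node, "hotel"))
```

A single AND and comparison replaces scanning every keyword in the node. The filter never rejects a keyword that is present, but it can report a false positive, so nodes that pass still need their keywords checked.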
4	A Count-Based Partition Approach to the Design of the Range-Based Bitmap Indexes for Data Warehouses Lin, Chien-Hsiu 29 July 2004 Data warehouses contain data consolidated from several operational databases and provide historical, summarized data that is more appropriate for analysis than detailed, individual records. On-Line Analytical Processing (OLAP) provides advanced analysis tools to extract information from data stored in a data warehouse. Fast response time is essential for on-line decision support, and a bitmap index can achieve this goal in read-mostly environments. When data has high cardinality, we prefer the Range-Based Index (RBI), which divides the attribute values into several partitions and uses one bitmap vector to represent each range. With RBI, however, the number of records assigned to different ranges can be highly unbalanced, resulting in different disk access times for different queries. Wu et al. proposed DBEC, an algorithm for RBI that takes the data distribution into consideration. However, the DBEC strategy cannot guarantee a partition result with the given number of bitmap vectors, PN. Moreover, data records with the same value may be partitioned into different bitmap vectors, which leads to long disk I/O times. Therefore, we propose the IPDF, CP, and CP* strategies for constructing dynamic range-based indexes for data that has high cardinality and is not uniformly distributed. The IPDF strategy decides each partition according to the probability density function (p.d.f.). The CP strategy sorts the data and partitions it into PN groups, one for every w consecutive records. The CP* strategy improves on the CP strategy by adjusting the cutting points so that data records with the same value are assigned to the same partition. On the other hand, we can also take the history of users' queries into consideration.
Based on the greedy approach, we propose the GreedyExt and GreedyRange strategies. The GreedyExt strategy is used for answering exact queries and the GreedyRange strategy for answering range queries. The two strategies decide the set of queries used to construct the bitmap vectors so that the average response time of answering queries is reduced. Moreover, a bitmap index consists of a set of bitmap vectors, and the size of the bitmap index could be much larger than the capacity of the disk. We propose the FZ strategy, which compresses each bitmap vector to reduce storage space and provides efficient bitwise operations without decompressing the bitmap vectors. Finally, our performance analysis shows that the CP* strategy can outperform the CP strategy in terms of the number of disk accesses. Our simulation shows that the ranges produced by the IPDF and CP* strategies are more uniform than those produced by the DBEC strategy. The GreedyExt and GreedyRange strategies can provide fast response times in most situations. Moreover, the FZ strategy can reduce storage space more than the WAH strategy. bitmap index range query data warehouse compress OLAP
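The count-based partitioning described above can be sketched in a few lines. The abstract does not say in which direction CP* moves a cutting point, so the sketch below assumes it moves forward until a run of equal values ends; the function names and the list-of-bits bitmap representation are also illustrative assumptions, and the sketch may return fewer than PN ranges when the data has long runs of equal values.

```python
def cp_star_partition(values, pn):
    """Split sorted values into at most pn count-balanced ranges,
    never separating records that share the same value."""
    data = sorted(values)
    n = len(data)
    w = -(-n // pn)  # ceil(n / pn): target number of records per range
    partitions = []
    start = 0
    while start < n:
        end = min(start + w, n)
        # Assumed adjustment: push the cutting point forward while it
        # would fall inside a run of equal values.
        while end < n and data[end] == data[end - 1]:
            end += 1
        partitions.append(data[start:end])
        start = end
    return partitions


def build_bitmaps(values, partitions):
    """One bitmap vector per range: bit i is 1 when record i falls in that range."""
    return [[1 if part[0] <= v <= part[-1] else 0 for v in values]
            for part in partitions]


def range_query(values, partitions, bitmaps, q_low, q_high):
    """OR the bitmaps of every range overlapping the query, then verify candidates."""
    hits = [0] * len(values)
    for part, bitmap in zip(partitions, bitmaps):
        if part[0] <= q_high and q_low <= part[-1]:
            hits = [h | b for h, b in zip(hits, bitmap)]
    # A boundary range may cover values outside the query, so each candidate is checked.
    return [i for i, h in enumerate(hits) if h and q_low <= values[i] <= q_high]


records = [5, 2, 1, 2, 6, 3, 2, 4, 5, 2]
parts = cp_star_partition(records, pn=3)
bitmaps = build_bitmaps(records, parts)
print(parts)
print(range_query(records, parts, bitmaps, 4, 6))
```

With these ten records and PN = 3, the first cutting point would fall inside the run of 2s after four records, so it moves forward by one and all records with value 2 share a single bitmap vector.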
5	Efficient Range and Join Query Processing in Massively Distributed Peer-to-Peer Networks Wang, Qiang January 2008 Peer-to-peer (P2P) has become a modern distributed computing architecture that supports massively large-scale data management and query processing. Complex query operators such as range operator and join operator are needed by various distributed applications, including content distribution, locality-aware services, computing resource sharing, and many others. This dissertation tackles a number of problems related to range and join query processing in P2P systems: fault-tolerant range query processing under structured P2P architecture, distributed range caching under unstructured P2P architecture, and integration of heterogeneous data under unstructured P2P architecture. To support fault-tolerant range query processing so as to provide strong performance guarantees in the presence of network churn, effective replication schemes are developed at either the overlay network level or the query processing level. To facilitate range query processing, a prefetch-based caching approach is proposed to eliminate the performance bottlenecks incurred by those data items that are not well cached in the network. Finally, a purely decentralized partition-based join query operator is devised to realize bandwidth-efficient join query processing under unstructured P2P architecture. Theoretical analysis and experimental simulations demonstrate the effectiveness of the proposed approaches. peer to peer networks range query processing join query processing Computer Science
7	BATON: A Balanced Tree Structure for Peer-to-Peer Networks Jagadish, H.V., Ooi, Beng Chin, Rinard, Martin C., Vu, Quang Hieu 01 1900 We propose a balanced tree structure overlay on a peer-to-peer network capable of supporting both exact queries and range queries efficiently. In spite of the tree structure causing distinctions to be made between nodes at different levels in the tree, we show that the load at each node is approximately equal. In spite of the tree structure providing precisely one path between any pair of nodes, we show that sideways routing tables maintained at each node provide sufficient fault tolerance to permit efficient repair. Specifically, in a network with N nodes, we guarantee that both exact queries and range queries can be answered in O(log N) steps and also that update operations (to both data and network) have an amortized cost of O(log N). An experimental assessment validates the practicality of our proposal. / Singapore-MIT Alliance (SMA) Balanced Tree Structure Load Balancing peer-to-peer system range query
8	Novel spatial query processing techniques for scaling location based services Pesti, Peter 12 November 2012 Location based services (LBS) are gaining widespread user acceptance and increased daily usage. GPS based mobile navigation systems (Garmin), location-related social network updates and "check-ins" (Facebook), location-based games (Nokia), friend queries (Foursquare) and ads (Google) are some of the popular LBSs available to mobile users today. Despite these successes, current user services fall short of a vision where mobile users could ask for continuous location-based services with always-up-to-date information around them, such as the list of friends or favorite restaurants within 15 minutes of driving. Providing such a location based service in real time faces a number of technical challenges. In this dissertation research, we propose a suite of novel techniques and system architectures to address some known technical challenges of continuous location queries and updates. Our solution approaches enable the creation of new, practical and scalable location based services with better energy efficiency on mobile clients and higher throughput at the location servers. Our first contribution is the development of RoadTrack, a road network aware and query-aware location update framework and a suite of algorithms. A unique characteristic of RoadTrack is the innovative design of encounter points and system-defined precincts to manage the desired spatial resolution of location updates for different mobile clients while reducing the complexity and energy consumption of location update strategies. 
The second novelty of this dissertation research is the technical development of Dandelion data structures and algorithms that can deliver superior performance for the periodic re-evaluation of continuous road-network distance based location queries, when compared with the alternative of repeatedly performing a network expansion along a mobile user's trajectory. The third contribution of this dissertation research is the FastExpand algorithm that can speed up the computation of single-issue shortest-distance road network queries. Finally, we have developed the open source GT MobiSim mobility simulator, a discrete event simulation platform to generate realistic driving trajectories for real road maps. It has been downloaded and utilized by many to evaluate the efficiency and effectiveness of the location query and location update algorithms, including the research efforts in this dissertation. CQ Range query Road network Simulation Lbs Location based services Mobile Query Continuous query Querying (Computer science) Database searching System design
9	Data Distribution Management In Large-scale Distributed Environments Gu, Yunfeng 15 February 2012 Data Distribution Management (DDM) deals with two basic problems: how to distribute data generated at the application layer among the underlying nodes of a distributed system, and how to retrieve that data whenever it is needed. This thesis explores DDM in two different network environments: peer-to-peer (P2P) overlay networks and cluster-based network environments. DDM in P2P overlay networks is considered a more complete concept of building and maintaining a P2P overlay architecture than a simple data fetching scheme, and is closely related to the more commonly known associative searching or queries. DDM in the cluster-based network environment is one of the important services provided by simulation middleware to support real-time distributed interactive simulations. The only feature shared by DDM in both environments is that both are built to provide a data indexing service. Because of these fundamental differences, we have designed and developed a novel distributed data structure, the Hierarchically Distributed Tree (HD Tree), to support range queries in P2P overlay networks. All the relevant problems of a distributed data structure, including scalability, self-organization, fault tolerance, and load balancing, have been studied. Both theoretical analysis and experimental results show that the HD Tree is able to give a complete view of system states when processing multi-dimensional range queries at different levels of selectivity and in various error-prone routing environments. On the other hand, a novel DDM scheme, the Adaptive Grid-based DDM scheme, is proposed to improve DDM performance in the cluster-based network environment. This new DDM scheme evaluates the input size of a simulation based on probability models. 
The optimum DDM performance is best approached by running the simulation in the mode that is most appropriate to its size. Data Distribution Management DDM Range Query Associative searching Multi-dimensional Simulation Distributed P2P HLA/RTI Cluster Data structure Overlay Region-based Grid-based HD Tree AGB DDM
