Global ETD Search

1	Summarization of very large spatial dataset Liu, Qing, Computer Science & Engineering, Faculty of Engineering, UNSW January 2006 (has links) Nowadays there are a large number of applications, such as digital library information retrieval, business data analysis, CAD/CAM, multimedia applications with images and sound, real-time process control and scientific computation, with data sets of gigabytes, terabytes or even petabytes. Because data distributions are too large to be stored accurately, maintaining compact and accurate summarized information about the underlying data is of crucial importance. The summarization problem for Level 1 (disjoint and non-disjoint) topological relationships has been well studied over the past few years. However, spatial database users are often interested in a much richer set of spatial relations such as contains. Little work has been done on summarization for Level 2 topological relationships, which include the contains, contained, overlap, equal and disjoint relations. We study the problem of effective summarization to represent the underlying data distribution in order to answer window queries for Level 2 topological relationships. The cell-density based approach has been demonstrated to be an effective way to address this problem. The challenges are the accuracy of the results and the storage space required, which should be linearly proportional to the number of cells to be practical. In this thesis, we present several novel techniques to effectively construct cell-density based spatial histograms. Based on the proposed framework, exact results can be obtained in constant time for aligned window queries. To minimize the storage space of the framework, an approximation algorithm with an approximation ratio of 19/12 is presented, while the problem is shown to be NP-hard in general. Because the framework requires only a storage space linearly proportional to the number of cells, it is practical for many popular real datasets. 
To conform to a limited storage space, effective histogram construction and query algorithms are proposed which can provide approximate results but with high accuracy. The problem for non-aligned window queries is also investigated and techniques of un-even partitioned space are developed to support non-aligned window queries. Finally, we extend our techniques to 3D space. Our extensive experiments against both synthetic and real world datasets demonstrate the efficiency of the algorithms developed in this thesis. summarization spatial database histogram
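The constant-time claim for aligned window queries in entry 1 can be illustrated with a much simpler structure than the one the thesis develops. The following is a minimal sketch, assuming point data with non-negative coordinates and a uniform grid: it stores per-cell densities as 2D prefix sums and answers only "how many points fall inside this cell-aligned window", not the Level 2 relations (contains, contained, overlap) studied in the thesis. All function and parameter names are hypothetical.

```python
from typing import List, Sequence, Tuple


def build_prefix_sums(points: Sequence[Tuple[float, float]],
                      nx: int, ny: int, cell: float) -> List[List[int]]:
    """Build a (nx+1) x (ny+1) prefix-sum table over per-cell point counts.

    Assumption: coordinates are non-negative and lie inside the grid;
    points on the upper boundary are clamped into the last cell.
    """
    density = [[0] * ny for _ in range(nx)]
    for x, y in points:
        i = min(int(x // cell), nx - 1)
        j = min(int(y // cell), ny - 1)
        density[i][j] += 1

    # prefix[i][j] holds the number of points in cells [0, i) x [0, j).
    prefix = [[0] * (ny + 1) for _ in range(nx + 1)]
    for i in range(nx):
        for j in range(ny):
            prefix[i + 1][j + 1] = (density[i][j] + prefix[i][j + 1]
                                    + prefix[i + 1][j] - prefix[i][j])
    return prefix


def aligned_window_count(prefix: List[List[int]],
                         i0: int, j0: int, i1: int, j1: int) -> int:
    """Count points in the cell-aligned window [i0, i1) x [j0, j1).

    Four table lookups, so the cost does not depend on the window size.
    """
    return prefix[i1][j1] - prefix[i0][j1] - prefix[i1][j0] + prefix[i0][j0]


if __name__ == "__main__":
    pts = [(0.5, 0.5), (1.5, 0.5), (1.5, 1.5), (3.2, 3.9)]
    table = build_prefix_sums(pts, nx=4, ny=4, cell=1.0)
    print(aligned_window_count(table, 0, 0, 2, 2))
```

The table has one entry per grid cell (plus a border), so storage grows linearly with the number of cells, which is the practicality criterion the abstract names.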
2	Computational Methods for Spatial OLAP Baltzer, Oliver 12 April 2011 (has links) Data warehousing and On-line Analytical Processing (OLAP) are powerful tools for processing and analyzing business and analytical data. It is estimated that 80% of the data stored in data warehouses have some spatial components. It is our belief that there is a need for powerful OLAP tools that are capable of processing and analyzing spatial data. This thesis explores the design and implementation of Spatial OLAP (SOLAP) systems and describes approaches to support the characteristic features of OLAP while seamlessly integrating spatial data into the analysis process. In particular, we analyze the evaluation of OLAP queries in the presence of asymmetric, multiple-alternative, generalized, and non-strict spatial dimension hierarchies. We introduce a new pipeline-based query evaluation model that is comprehensive and powerful in that it provides a uniform approach to the expression of spatial OLAP queries that address all major dimension hierarchy types while permitting a uniform treatment of both spatial and non-spatial data. A reference implementation called "LISA" validates the objectives of our model and demonstrates favorable scalability and performance on modern multi-processor and multi-core hardware platforms. We also describe a new "geoCUBE" index, to address the fundamental problem of how to represent, index and efficiently query data that is defined by a mix of spatial and categorical attribute values. The geoCUBE index extends existing methods for indexing OLAP data to spatial data types. The effectiveness of the geoCUBE data structure is confirmed through evaluation. Lastly, we propose algorithms that facilitate OLAP-like analysis of moving object data. We introduce a new class of GROUP BY operators specifically targeted to the OLAP analysis of trajectories and to answering aggregate queries with respect to the spatio-temporal movement of a set of objects. 
Through an experimental evaluation we show our operators can be used to reliably identify groups of related trajectories when applied to synthetic and real world moving object data. OLAP, spatial, database
3	Storing Protein Structure in Spatial Database Yeung, Tony 12 May 2005 (has links) In recent years, the field of bioinformatics has grown at an unprecedented scale. The amount of data generated by different genome projects demands a new and efficient way of storing and retrieving information. The analysis and management of protein structure information has become one of the main focuses. It is well known that a protein’s functions differ depending on its structure’s position in 3-dimensional space. Because protein structures are exceedingly large, complex, and multi-dimensional, there is a need for a data model that can fulfill the requirements of storing protein structures in accordance with their spatial arrangement and topological relationships and, at the same time, provide tools to analyze the information stored. With the emergence of spatial databases, first used in the field of Geographical Information Systems, the data model for protein structure could be based on the geographic model, as the two share several uncannily similar traits. The geometry of proteins can be modeled using the spatial types provided in a spatial database. In a similar way, special geometry queries used for geographical analysis can also be used to provide information for analysis of the structure of proteins. This thesis explores the mechanics of extracting structural information for a protein from a flat file (PDB), storing that information in a spatial data model, and performing analysis using geometric operators provided by the spatial database. The database used is Oracle 9i. Most features are provided by the Oracle Spatial package. Queries using the aforementioned ideas will be demonstrated. protein structures spatial database Computer Sciences
4	A metadata management system for web based SDIs Phillips, Andrew Heath Unknown Date (has links) The process of decision making is best undertaken with the consideration of as much information as possible. One way to maximise the amount of information used in the process is to use metadata engines. Metadata engines can be used to create virtual databases, which are collections of individual datasets located across a network. Virtual databases allow decisions to be made using data from many different databases at many different locations on a network, while shielding the user from this fact. From the user's point of view, they are only using data from one location. This thesis investigates some of the concepts behind metadata engines for Internet based Spatial Data Infrastructures. The thesis has a particular emphasis on how metadata engines can be used to create virtual databases that could be of use in the planning and decision making processes. The thesis also investigates some current spatial data technologies such as SDIs, data warehouses, data marts and clearing houses, their interoperability and their relationship to metadata engines. It also explores some of the more recent spatial data applications that have been developed in the context of metadata engines and Spatial Data Infrastructures.
5	Toward a Data-Type-Based Real Time Geospatial Data Stream Management System Zhang, Chengyang 05 1900 (has links) The advent of sensory and communication technologies enables the generation and consumption of large volumes of streaming data. Many of these data streams are geo-referenced. Existing spatio-temporal databases and data stream management systems are not capable of handling real time queries on spatial extents. In this thesis, we investigated several fundamental research issues toward building a data-type-based real time geospatial data stream management system. The thesis makes contributions in the following areas: geo-stream data models, aggregation, window-based nearest neighbor operators, and query optimization strategies. The proposed geo-stream data model is based on second-order logic and multi-typed algebra. Both abstract and discrete data models are proposed and exemplified. I further propose two useful geo-stream operators, namely Region By and WNN, which abstract common aggregation and nearest neighbor queries as generalized data model constructs. Finally, I propose three query optimization algorithms based on spatial, temporal, and spatio-temporal constraints of geo-streams. I show the effectiveness of the data model through many query examples. The effectiveness and the efficiency of the algorithms are validated through extensive experiments on both synthetic and real data sets. This work established the fundamental building blocks toward a full-fledged geo-stream database management system and has potential impact in many applications such as hazard weather alerting and monitoring, traffic analysis, and environmental modeling. Data stream sensor networks spatial database
6	Efficient Concurrent Operations in Spatial Databases Dai, Jing 16 November 2009 (has links) As demanded by applications such as GIS, CAD, ecology analysis, and space research, efficient spatial data access methods have attracted much research. In particular, moving object management and continuous spatial queries are receiving increasing attention in the spatial database area. However, most of the existing spatial query processing approaches were designed for single-user environments and may not ensure correctness and data consistency in multiple-user environments. This research focuses on designing efficient concurrent operations on spatial datasets. Current multidimensional data access methods can be categorized into two types: 1) pure multidimensional indexing structures such as the R-tree family and grid file; 2) linear spatial access methods, represented by the Space-Filling Curve (SFC) combined with B-trees. Concurrency control protocols have been designed for some pure multidimensional indexing structures, but none of them is suitable for variants of R-trees with object clipping, which are efficient in searching. On the other hand, there is no concurrency control protocol designed for linear spatial indexing structures, to which the one-dimensional concurrency control protocols cannot be directly applied. Furthermore, the recently designed query processing approaches for moving objects have not been protected by any efficient concurrency control protocols. In this research, solutions for efficient concurrent access frameworks on both types of spatial indexing structures are provided, as well as for continuous query processing on moving objects, for multiple-user environments. These concurrent access frameworks can satisfy the concurrency control requirements, while providing outstanding performance for concurrent queries. 
Major contributions of this research include: (1) a new efficient spatial indexing approach with an object clipping technique, the ZR+-tree, that outperforms the R-tree and R+-tree on searching; (2) a concurrency control protocol, GLIP, to provide high throughput and phantom update protection on spatial indexing with object clipping; (3) efficient concurrent operations for indices based on linear spatial access methods, which together form the CLAM protocol; (4) efficient concurrent continuous query processing on moving objects for both R-tree-based and linear spatial indexing frameworks; (5) a generic access framework, Disposable Index, for optimal location update and parallel search. / Ph. D. Indexing Query Processing Concurrency Control Spatial Database
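Entry 6 describes linear spatial access methods as a space-filling curve combined with B-trees. A minimal sketch of that idea follows, assuming the Z-order (Morton) curve as the space-filling curve (the abstract does not say which curve is used) and a sorted Python list as a stand-in for the B-tree. The function names and the 16-bit coordinate width are hypothetical, and no concurrency control is shown.

```python
import bisect
from typing import List, Tuple


def morton_encode(x: int, y: int, bits: int = 16) -> int:
    """Interleave the bits of x and y into a single Z-order key.

    Bit b of x goes to position 2*b, bit b of y to position 2*b + 1.
    """
    code = 0
    for b in range(bits):
        code |= ((x >> b) & 1) << (2 * b)
        code |= ((y >> b) & 1) << (2 * b + 1)
    return code


def morton_decode(code: int, bits: int = 16) -> Tuple[int, int]:
    """Recover (x, y) from a Z-order key produced by morton_encode."""
    x = y = 0
    for b in range(bits):
        x |= ((code >> (2 * b)) & 1) << b
        y |= ((code >> (2 * b + 1)) & 1) << b
    return x, y


def insert_point(index: List[int], x: int, y: int) -> None:
    """Insert a point into a sorted key list (stand-in for a B-tree)."""
    bisect.insort(index, morton_encode(x, y))


if __name__ == "__main__":
    index: List[int] = []
    for px, py in [(3, 5), (0, 0), (1, 1), (2, 0)]:
        insert_point(index, px, py)
    # Keys are kept in one-dimensional order along the curve.
    print(index)
    print([morton_decode(k) for k in index])
```

Because every point maps to a single integer key, any one-dimensional ordered index can store it; the abstract's point is that one-dimensional concurrency control protocols still cannot be applied to such an index directly.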
7	Evaluation of Shortest Path Query Algorithm in Spatial Databases Lim, Heechul January 2003 (has links) Many variations of algorithms for finding the shortest path in a large graph have been introduced recently due to the needs of applications like the Geographic Information System (GIS) or Intelligent Transportation System (ITS). The primary subjects of those algorithms are materialization and hierarchical path views. Some studies focus on the materialization and sacrifice the pre-computational costs and storage costs for faster computation of a query. Other studies focus on the shortest-path algorithm, which has less pre-computation and storage but takes more time to compute the shortest path. The main objective of this thesis is to accelerate the computation time for the shortest-path queries while keeping the degree of materialization as low as possible. This thesis explores two different categories: 1) the reduction of the I/O-costs for multiple queries, and 2) the reduction of search spaces in a graph. The thesis proposes two simple algorithms to reduce the I/O-costs, especially for multiple queries. To tackle the problem of reducing search spaces, we give two different levels of materializations, namely, the boundary set distance matrix and x-Hop sketch graph, both of which materialize the shortest-path view of the boundary nodes in a partitioned graph. Our experiments show that a combination of the suggested solutions for 1) and 2) performs better than the original Disk-based SP algorithm [7], on which our work is based, and requires much less storage than HEPV [3]. Computer Science Shortest Path Query Spatial Database Pruning Algorithm
9	NAAK-Tree: An Index for Querying Spatial Approximate Keywords Liou, Yen-Guo 11 July 2012 (has links) In recent years, geographic information system (GIS) databases have developed quickly and play a significant role in many applications. Many of these applications allow users to find objects using keywords and spatial information at the same time. Most research on spatial keyword queries only considers exact matches between the textual information in the database and in the query. Since users may not know how to spell the exact keyword, they may issue a query with an approximate keyword instead of the exact one. Therefore, how to process approximate-keyword queries in a spatial database has become an important research topic. Alsubaiee et al. have proposed the Location-Based-Approximate-Keyword-tree (LBAK-tree) index structure, which augments a tree-based spatial index with approximate-string indexes such as a gram-based index. However, the LBAK-tree is an R-tree based index structure. The nodes of the R-tree have to be split and reinserted when they become full. Because of this, it cannot index the spatial attribute and the textual attribute at the same time; it stores the keywords in the nodes after the R-tree has already been built. Being based on the R-tree, it has to search all the children of a node to insert a new item or answer a query. Moreover, after the needed keywords are found using the approximate index, the nodes are probed by checking the intersection of the similar keyword sets and the keywords stored in the nodes. However, the higher the level of a node, the larger the number of keywords stored in it, so it takes a long time to check the intersections. In addition, the LBAK-tree checks all the intersections even if one of the intersections is already an empty set. 
Therefore, in this thesis, we propose the Nine-Area-Approximate-Keyword-tree (NAAK-tree) index structure to process spatial approximate-keyword queries. We do not have to partition the space to construct the spatial index. We do not have to reinsert the children when splitting nodes, so we can handle the keywords at the same time. We can use the spatial number to find the nodes that satisfy the spatial condition of the query. We also augment the NAAK-tree with signatures to speed up evaluation of the textual condition. We use the union of the bit strings of the keywords in a node to represent them in that node. Therefore, we can efficiently filter out the nodes that contain no keyword corresponding to the query by checking the signatures just once, without checking all the keywords stored in the nodes. With our NAAK-tree, if there exists one empty set among the similar keyword sets, we do not check all the similar keyword sets. Our simulation results show that the NAAK-tree is more efficient than the LBAK-tree at building the index and answering spatial approximate-keyword queries. Signature Index Structure Approximate-Keyword Spatial Database Range Query
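The signature filter described in entry 9 (the union of the keywords' bit strings represents a node) can be sketched as follows. This is an illustrative sketch of generic superimposed-coding signatures, not the NAAK-tree's actual encoding: the signature width, the number of bits set per keyword, the use of SHA-256 to pick bit positions, and all function names are assumptions. A failed check proves the keyword is absent from the node, while a passed check can be a false positive and still requires looking at the stored keywords.

```python
import hashlib
from typing import Iterable

SIG_BITS = 64          # assumed signature width
BITS_PER_KEYWORD = 3   # assumed number of bits set per keyword


def keyword_signature(keyword: str) -> int:
    """Map a keyword to a bit string with up to BITS_PER_KEYWORD bits set."""
    digest = hashlib.sha256(keyword.encode("utf-8")).digest()
    sig = 0
    for k in range(BITS_PER_KEYWORD):
        # Use two bytes of the digest to choose each bit position.
        pos = int.from_bytes(digest[2 * k:2 * k + 2], "big") % SIG_BITS
        sig |= 1 << pos
    return sig


def node_signature(keywords: Iterable[str]) -> int:
    """Union (bitwise OR) of the signatures of all keywords in a node."""
    sig = 0
    for kw in keywords:
        sig |= keyword_signature(kw)
    return sig


def may_contain(node_sig: int, keyword: str) -> bool:
    """Return False only if the node certainly lacks the keyword."""
    ks = keyword_signature(keyword)
    return (node_sig & ks) == ks


if __name__ == "__main__":
    node = node_signature(["coffee", "bakery", "library"])
    print(may_contain(node, "bakery"))
    print(may_contain(node_signature([]), "bakery"))
```

A single bitwise test per node replaces a scan over every keyword stored in that node, which is the saving the abstract attributes to signatures.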
10	A study of three paradigms for storing geospatial data: distributed-cloud model, relational database, and indexed flat file Toups, Matthew A 13 May 2016 (has links) Geographic Information Systems (GIS) and related applications of geospatial data were once a small software niche; today nearly all Internet and mobile users utilize some sort of mapping or location-aware software. This widespread use reaches beyond mere consumption of geodata; projects like OpenStreetMap (OSM) represent a new source of geodata production, sometimes dubbed “Volunteered Geographic Information.” The volume of geodata produced and the user demand for geodata will surely continue to grow, so the storage and query techniques for geospatial data must evolve accordingly. This thesis compares three paradigms for systems that manage vector data. Over the past few decades these methodologies have fallen in and out of favor. Today, some are considered new and experimental (distributed), others nearly forgotten (flat file), and others are the workhorse of present-day GIS (relational database). Each is well-suited to some use cases, and poorly-suited to others. This thesis investigates exemplars of each paradigm. geospatial index spatial database OpenStreetMap Accumulo GeoMesa PostGIS Databases and Information Systems Data Storage Systems
