About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
41

Dotazování nad časoprostorovými daty pohybujících se objektů / Querying Spatio-Temporal Data of Moving Objects

Dvořáček, Ondřej January 2009 (has links)
This master's thesis studies possible approaches to representing moving-object data and to querying such spatio-temporal data. It also reviews the results of the master's thesis by Ing. Jaroslav Vališ, which were intended to serve as a starting point for this work. Based on the theoretical grounds defined at the beginning of the thesis, however, a new database extension for storing and querying spatio-temporal data was designed and implemented. Its use is demonstrated in an example application, which builds its own domain-specific database functions on top of the extension. The conclusion outlines directions for further development of the extension and places the results of the thesis in the context of the follow-up project, the doctoral thesis "Moving objects database".
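
As an illustration of the kind of functionality such an extension provides (a hedged sketch in Python, not the thesis's actual interface; all class and function names here are hypothetical), the following shows a trajectory stored as timestamped fixes, position interpolation at an arbitrary instant, and a simple time-window query:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

@dataclass
class Fix:
    t: datetime          # timestamp of the observation
    x: float             # e.g. longitude or metres east
    y: float             # e.g. latitude or metres north

class Trajectory:
    """A moving object stored as a time-ordered list of fixes."""
    def __init__(self, fixes: List[Fix]):
        self.fixes = sorted(fixes, key=lambda f: f.t)

    def position_at(self, t: datetime) -> Optional[Tuple[float, float]]:
        """Linearly interpolate the position at time t (None if t is outside the trajectory)."""
        fs = self.fixes
        if not fs or t < fs[0].t or t > fs[-1].t:
            return None
        for a, b in zip(fs, fs[1:]):
            if a.t <= t <= b.t:
                span = (b.t - a.t).total_seconds() or 1.0
                w = (t - a.t).total_seconds() / span
                return (a.x + w * (b.x - a.x), a.y + w * (b.y - a.y))
        return (fs[-1].x, fs[-1].y)

    def window(self, start: datetime, end: datetime) -> List[Fix]:
        """Return the fixes observed inside a time window (a basic spatio-temporal query)."""
        return [f for f in self.fixes if start <= f.t <= end]

# usage
t0 = datetime(2009, 1, 1, 12, 0)
traj = Trajectory([Fix(t0 + timedelta(minutes=i), x=float(i), y=2.0 * i) for i in range(10)])
print(traj.position_at(t0 + timedelta(minutes=2, seconds=30)))   # interpolated point
print(len(traj.window(t0, t0 + timedelta(minutes=5))))           # fixes in the window
```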
42

Spatio-temporal Analysis of Urban Heat Island and Heat Wave Evolution using Time-series Remote Sensing Images: Method and Applications

Yang, Bo 11 June 2019 (has links)
No description available.
43

Visualization and Interaction with Temporal Data using Data Cubes in the Global Earth Observation System of Systems / Visualisering och Interaktion av Tidsbaserad Data genom användning av Data Cubes inom Global Earth Observation System of Systems

Adrup, Joakim January 2018 (has links)
The purpose of this study was to explore the usage of data cubes in the context of the Global Earth Observation System of Systems (GEOSS). The study investigated what added benefit the capabilities of data cubes could provide to users of the GEOSS platform. In earth observation, a data cube is a concept for how data should be handled and provided by a data server, covering aspects such as flexible extraction of subsets and processing capabilities. The study found that the most frequent use case for data cubes was time analysis. One of the main services provided by the GEOSS portal is the discovery and inspection of datasets. A timeline interface was therefore constructed to facilitate the exploration and inspection of datasets with a temporal dimension. The datasets were provided by a data cube, and the interface made use of the data cube's ability to retrieve subsets of data along any arbitrary axis. A usability evaluation of the timeline interface was conducted to gain insight into user requirements and user satisfaction. The results showed that the design worked well in many regards and ranked high in user satisfaction, while also highlighting areas of improvement, important design limitations and challenges, and suggestions on how these could be approached in different ways.
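
The thesis does not publish its data-cube API, but the arbitrary-axis subsetting described above can be illustrated with a minimal sketch using the open-source xarray library on a toy earth-observation cube (all variable names and values are placeholders):

```python
import numpy as np
import pandas as pd
import xarray as xr

# A toy earth-observation cube: one value per (time, latitude, longitude) cell.
times = pd.date_range("2018-01-01", periods=12, freq="MS")
lats = np.linspace(55.0, 60.0, 6)
lons = np.linspace(10.0, 20.0, 11)
cube = xr.DataArray(
    np.random.rand(len(times), len(lats), len(lons)),
    coords={"time": times, "lat": lats, "lon": lons},
    dims=("time", "lat", "lon"),
    name="ndvi",
)

# Subset along any axis: a spatial window for a range of months ...
subset = cube.sel(time=slice("2018-03-01", "2018-06-30"),
                  lat=slice(56.0, 58.0), lon=slice(12.0, 15.0))

# ... or collapse the spatial axes to get the time series a timeline widget would plot.
timeline = subset.mean(dim=("lat", "lon"))
print(timeline.values)
```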
44

Creating and Evaluating an Interactive Visualization Tool For Crowd Trajectory Data / Att bygga och utvärdera ett interaktivt visualiseringsverktyg för gångbanor hos folksamlingar

Sonebo, Christina, Ekelöf, Joel January 2018 (has links)
There is currently no set standard for evaluating visualization environments. Even though the number of visualizations has increased, there is a tendency to overlook the evaluation of their usability. This thesis investigates how a visualization tool for crowd trajectory data can be built using the visualization technique of animated maps and the JavaScript library D3.js. Furthermore, it explores how such a visualization tool can be evaluated according to a suggested framework for spatio-temporal data. The developed tool uses data from the UCY Graphics Lab, consisting of 415 trajectories collected from a video recorded in a campus area. User evaluation was performed through a user test with six participants, measuring effectiveness as completed tasks and satisfaction as ease of use, for three different amounts of trajectories. Qualitative data was recorded using the think-aloud protocol to gather feedback for further improving the implementation. The evaluation shows that the visualization tool is usable and effective, and that the technique of animated maps in combination with a heatmap can aid users when exploring and formulating ideas about data of this kind. It is also concluded that the framework is a possible tool to utilize when validating visualization systems for crowd trajectory data.
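
The thesis implements its tool in D3.js; purely as a conceptual illustration (and in Python rather than JavaScript), the sketch below shows the same two views on toy data, drawn trajectories and a point-density heatmap. All data here are synthetic placeholders:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Toy stand-in for the trajectory data: each trajectory is an (n_points, 2) array.
trajectories = [np.cumsum(rng.normal(size=(50, 2)), axis=0) + rng.uniform(0, 10, size=2)
                for _ in range(40)]

fig, (ax_paths, ax_heat) = plt.subplots(1, 2, figsize=(10, 4))

# View 1: raw paths, the "animated map" shown here as static polylines.
for traj in trajectories:
    ax_paths.plot(traj[:, 0], traj[:, 1], linewidth=0.7, alpha=0.6)
ax_paths.set_title("Trajectories")

# View 2: occupancy heatmap aggregated over all points of all trajectories.
points = np.vstack(trajectories)
heat, xedges, yedges = np.histogram2d(points[:, 0], points[:, 1], bins=40)
ax_heat.imshow(heat.T, origin="lower", aspect="auto",
               extent=[xedges[0], xedges[-1], yedges[0], yedges[-1]])
ax_heat.set_title("Point-density heatmap")

plt.tight_layout()
plt.show()
```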
45

Functional Norm Regularization for Margin-Based Ranking on Temporal Data

Stojkovic, Ivan January 2018 (has links)
Quantifying a property of interest is an important problem in many domains, e.g., assessing the condition of a patient, estimating the risk of an investment, or judging the relevance of a search result. However, the properties of interest are often latent and hard to assess directly, making it difficult to obtain the classification or regression labels needed to learn a predictive model from observable features. In such cases it is typically much easier to obtain a relative comparison of two instances, i.e., to assess which one is more intense with respect to the property of interest. One framework able to learn from this kind of supervision is the ranking SVM, which forms the basis of our approach. Applications on biomedical datasets typically pose specific additional challenges. The first, and major, one is the limited number of examples, due to expensive measuring technology and/or the infrequency of the conditions of interest; such a limited number of examples makes both the identification of patterns/models and their validation less reliable. Repeated samples from the same subject are collected on multiple occasions over time, which breaks the IID sample assumption and introduces a dependency structure that needs to be taken into account appropriately. Also, feature vectors are high-dimensional, typically with many more features than samples, making models less useful and their learning less efficient. The hypothesis of this dissertation is that functional norm regularization can help alleviate these challenges by improving the generalization ability and/or learning efficiency of predictive models, here specifically of approaches based on the ranking SVM framework. The temporal nature of the data is addressed with a loss that fosters temporal smoothness of the functional mapping, reflecting the assumption that temporally proximate samples are more correlated. The large number of feature variables is handled with the sparsity-inducing L1 norm, so that most features have zero effect in the learned functional mapping. The proposed sparse (temporal) ranking objective is convex but non-differentiable; a smooth dual form is therefore derived, taking the form of a quadratic function with box constraints, which allows efficient optimization. For the case of multiple similar tasks, a joint learning approach based on matrix norm regularization, using the trace norm L* and the row-sparse L21 norm, is also proposed, and an alternating minimization scheme with a proximal optimization algorithm is developed to solve this multi-task objective. The generalization potential of the proposed high-dimensional and multi-task ranking formulations was assessed in a series of evaluations on synthetic and real datasets. The high-dimensional approach was applied to learning a disease severity score from gene expression data in human influenza cases and compared against several alternative approaches; it resulted in a scoring function with improved predictive performance, as measured by the fraction of correctly ordered test pairs, and a set of selected features with high robustness according to three similarity measures. The multi-task approach was applied to three human viral infection problems and to learning exam scores in Math and English. The proposed formulation with the mixed matrix norm was overall more accurate than formulations with a single norm regularization. / Computer and Information Science
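
The dissertation's exact objective and dual derivation are not reproduced here; the following is a hedged sketch of the standard pairwise-transform trick for ranking from relative comparisons, combined with an L1-penalized linear classifier to obtain a sparse scoring function, on synthetic data:

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(1)

# Toy data: 60 samples, 200 features, only 5 of which drive the latent "severity" score.
X = rng.normal(size=(60, 200))
w_true = np.zeros(200)
w_true[:5] = [2.0, -1.5, 1.0, 0.8, -0.5]
score = X @ w_true

# Supervision comes as ordered pairs (i, j) meaning "sample i is more severe than sample j".
pairs = [(i, j) for i in range(60) for j in range(60) if score[i] > score[j] + 0.5]

# Pairwise transform: learn w such that w @ (x_i - x_j) > 0 for every ordered pair.
# Alternate the sign so the binary classifier sees both classes.
diffs, labels = [], []
for k, (i, j) in enumerate(pairs):
    s = 1 if k % 2 == 0 else -1
    diffs.append(s * (X[i] - X[j]))
    labels.append(s)

# The L1 penalty induces sparsity in the learned ranking function.
model = LinearSVC(penalty="l1", loss="squared_hinge", dual=False,
                  C=0.1, fit_intercept=False, max_iter=10000)
model.fit(np.asarray(diffs), np.asarray(labels))

w = model.coef_.ravel()
print("non-zero weights:", np.sum(np.abs(w) > 1e-6))   # should be far fewer than 200
```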
46

Semiparametric Varying Coefficient Models for Matched Case-Crossover Studies

Ortega Villa, Ana Maria 23 November 2015 (has links)
Semiparametric modeling combines parametric and nonparametric models: some functions follow a known form and others an unknown form. In this dissertation we make contributions to semiparametric modeling for matched case-crossover data. In matched case-crossover studies, it is generally accepted that the covariates on which a case and its associated controls are matched cannot exert a confounding effect on the independent predictors included in the conditional logistic regression model; any stratum effect is removed by conditioning on the fixed set of case and controls in the stratum. However, matching covariates such as time and/or spatial location often play an important role as effect modifiers, and failing to include them leads to incorrect statistical estimation, prediction, and inference. Hence, in this dissertation we propose several approaches that allow the inclusion of time and spatial location, as well as other effect modifications such as heterogeneous subpopulations in the data. To address modification due to time, three methods are developed: a parametric approach, a semiparametric penalized approach, and a semiparametric Bayesian approach. We demonstrate the advantage of the one-stage semiparametric approaches using both a simulation study and an epidemiological example, a 1-4 bi-directional case-crossover study of childhood aseptic meningitis and drinking water turbidity. To address modifications due to time and spatial location, two methods are developed: the first is a semiparametric spatial-temporal varying coefficient model for a small number of locations; the second is a semiparametric spatial-temporal varying coefficient model appropriate when the number of locations among the subjects is medium to large. We demonstrate the accuracy of these approaches using simulation studies and, where appropriate, the epidemiological example of a 1-4 bi-directional case-crossover study. Finally, to explore further effect modification by heterogeneous subpopulations among strata, we propose a nonparametric Bayesian approach built on Dirichlet process priors, which clusters subpopulations and assesses heterogeneity. We demonstrate the accuracy of this approach using a simulation study, as well as an example of a 1-4 bi-directional case-crossover study. / Ph. D.
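
For readers unfamiliar with the matched case-crossover setting, the sketch below shows the standard conditional logistic likelihood contribution of a single 1:4 matched stratum, in which the matching covariates cancel out; it is a generic illustration, not the dissertation's varying coefficient model:

```python
import numpy as np

def stratum_neg_log_likelihood(beta: np.ndarray, x_case: np.ndarray,
                               x_controls: np.ndarray) -> float:
    """Conditional logistic contribution of one matched set (1 case, M controls).

    L = exp(x_case @ beta) / sum_j exp(x_j @ beta), the sum running over the case
    and its matched controls; covariates shared by the whole stratum cancel out.
    """
    etas = np.concatenate(([x_case @ beta], x_controls @ beta))
    # log-sum-exp for numerical stability
    m = etas.max()
    return -(etas[0] - (m + np.log(np.exp(etas - m).sum())))

# Toy 1:4 stratum: one exposure covariate (e.g. water turbidity) for the case
# period and its four matched control periods.
beta = np.array([0.7])
x_case = np.array([1.2])
x_controls = np.array([[0.3], [0.5], [0.1], [0.9]])
print(stratum_neg_log_likelihood(beta, x_case, x_controls))
```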
47

Designing Conventional, Spatial, and Temporal Data Warehouses: Concepts and Methodological Framework

Malinowski Gajda, Elzbieta 02 October 2006 (has links)
Decision support systems are interactive, computer-based information systems that provide data and analysis tools to assist managers at different levels of an organization in the process of decision making. Data warehouses (DWs) have been developed and deployed as an integral part of decision support systems. A data warehouse is a database that stores the high volumes of historical data required for analytical purposes. This data is extracted from operational databases, transformed into a coherent whole, and loaded into the DW during the extraction-transformation-loading (ETL) process. DW data can be dynamically manipulated using on-line analytical processing (OLAP) systems. DW and OLAP systems rely on a multidimensional model that includes measures, dimensions, and hierarchies. Measures are usually numeric additive values used for quantitative evaluation of different aspects of an organization; dimensions provide different analysis perspectives, while hierarchies allow measures to be analyzed at different levels of detail. Nevertheless, designers as well as users currently find it difficult to specify the multidimensional elements required for analysis. One reason is the lack of conceptual models for DW and OLAP system design that would allow data requirements to be expressed at an abstract level without considering implementation details. Another problem is that many kinds of complex hierarchies arising in real-world situations are not addressed by current DW and OLAP systems. In order to help designers build conceptual models for decision-support systems, and to help users better understand the data to be analyzed, this thesis proposes the MultiDimER model, a conceptual model for representing multidimensional data in DW and OLAP applications. The model is mainly based on existing ER constructs, for example entity types, attributes, and relationship types with their usual semantics, allowing the common concepts of dimensions, hierarchies, and measures to be represented. It also includes a conceptual classification of the different kinds of hierarchies existing in real-world situations and proposes graphical notations for them. On the other hand, users of DW and OLAP systems increasingly demand the inclusion of spatial data, whose visualization allows patterns to be revealed that are difficult to discover otherwise. However, although DWs typically include a spatial or location dimension, this dimension is usually represented in an alphanumeric format, and there is still a lack of a systematic study analyzing the inclusion and management of hierarchies and measures represented using spatial data. With the aim of satisfying the growing requirements of decision-making users, we extend the MultiDimER model by allowing spatial data to be included in the different elements composing the multidimensional model. The novelty of this contribution lies in the fact that a multidimensional model is seldom used for representing spatial data. To succeed with this proposal, we applied research achievements in the field of spatial databases to the specific features of a multidimensional model.
The spatial extension of a multidimensional model raises several issues addressed in this thesis, such as the influence of the different topological relationships between spatial objects forming a hierarchy on the procedures required for measure aggregation, the aggregation of spatial measures, and the inclusion of spatial measures without the presence of spatial dimensions, among others. Moreover, one important characteristic of multidimensional models is the presence of a time dimension for keeping track of changes in measures. However, this dimension cannot be used to model changes in other dimensions, so usual multidimensional models are not symmetric in the way they represent changes for measures and dimensions. Further, there is still a lack of analysis indicating which concepts already developed for providing temporal support in conventional databases can be applied to, and be useful for, the different elements composing a multidimensional model. In order to handle temporal changes to all elements of a multidimensional model in a similar manner, we introduce a temporal extension for the MultiDimER model. This extension is based on research in the area of temporal databases, which have been successfully used for modeling time-varying information for several decades. We propose the inclusion of different temporal types, such as valid and transaction time, which are obtained from source systems, in addition to the DW loading time generated in the DW. We use this temporal support for a conceptual representation of time-varying dimensions, hierarchies, and measures. We also refer to specific constraints that should be imposed on time-varying hierarchies and to the problem of handling multiple time granularities between source systems and DWs. Furthermore, the design of DWs is not an easy task: it requires considering all phases from requirements specification to the final implementation, including the ETL process, and it must take into account that the inclusion of different data items in a DW depends on both users' needs and data availability in source systems. Currently, however, designers must rely on their experience due to the lack of a methodological framework that considers these aspects. In order to assist developers during the DW design process, we propose a methodology for the design of conventional, spatial, and temporal DWs, covering the phases of requirements specification and conceptual, logical, and physical modeling. We include three different methods for requirements specification, depending on whether users, operational data sources, or both are the driving force in the process of requirements gathering, and show how each method leads to the creation of a conceptual multidimensional model. We also present the logical and physical design phases that refer to DW structures and the ETL process. To ensure the correctness of the proposed conceptual models, i.e., with conventional, spatial, and time-varying data, we formally define them, providing their syntax and semantics. With the aim of assessing the usability of the conceptual model, including the representation of different kinds of hierarchies as well as spatial and temporal support, we present real-world examples. Pursuing the goal that the proposed conceptual solutions can be implemented, we include their logical representations using relational and object-relational databases.
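
As a small illustration of measures, dimensions, and hierarchies (not the MultiDimER notation itself), the following pandas sketch rolls an additive measure up a store → city → country hierarchy; all table contents are invented:

```python
import pandas as pd

# A tiny fact table with one measure (amount) and the leaf level of two dimensions.
facts = pd.DataFrame({
    "store":  ["S1", "S1", "S2", "S3", "S3", "S4"],
    "month":  ["2006-01", "2006-02", "2006-01", "2006-01", "2006-02", "2006-02"],
    "amount": [100.0, 150.0, 80.0, 200.0, 120.0, 90.0],
})

# The Store dimension with its hierarchy: store -> city -> country.
store_dim = pd.DataFrame({
    "store":   ["S1", "S2", "S3", "S4"],
    "city":    ["Brussels", "Brussels", "Antwerp", "Antwerp"],
    "country": ["Belgium", "Belgium", "Belgium", "Belgium"],
})

# Roll the additive measure up the hierarchy one level at a time.
by_store = facts.groupby("store", as_index=False)["amount"].sum()
by_city = by_store.merge(store_dim, on="store").groupby("city", as_index=False)["amount"].sum()
by_country = by_store.merge(store_dim, on="store").groupby("country", as_index=False)["amount"].sum()
print(by_city)
print(by_country)
```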
48

Redes Bayesianas aplicadas a estimação da taxa de prêmio de seguro agrícola de produtividade / Bayesian networks applied to estimation of yield insurance premium

Polo, Lucas 08 July 2016 (has links)
Information that characterizes the risk of crop losses is necessary for underwriting crop and revenue insurance. The probability distribution of yield is one such piece of information, in particular the distribution of yield conditioned on climatic risk factors. This research applies Bayesian networks (directed acyclic graphs, or hierarchical Bayesian models) to estimate the probability distribution of soybean yield for some counties in Paraná state (Brazil), with a focus on comparative risk analysis. Meteorological data (ANA and INMET, 1970 to 2011) and remote sensing data (MODIS, 2000 to 2011) were used jointly to describe the climatic risk of production loss spatially. 
The yield data used in the study (COAMO, 2001 to 2011) had to be grouped to the county level; for that, data selection was performed in the spatial and temporal dimensions using a soybean crop map (estimated by SVM - support vector machine) and the results of a crop cycle identification algorithm. The interpolation needed to spatialize temperature used a trend component estimated from remote sensing data, to capture spatial variations of the variable that are obscured by traditional interpolation methods. As results, a significant relation was found between temperature observed at meteorological stations and the remote sensing data, supporting their joint use in the estimates. The soybean map classifier showed over-fitting for the crop seasons from which the training samples were collected. Besides supporting data selection, the cycle identification also yielded distributions of soybean seeding dates for Paraná state. Bayesian networks show great potential and some advantages when applied to agricultural risk modeling: representing the probability distribution as a graph eases the understanding of complex problems through causality assumptions and facilitates the fitting, structuring, and application of the probabilistic model. The log-normal distribution proved most adequate for modeling the environmental variables (thermal sum, accumulated precipitation, and longest period without rain), and the beta distribution for relative yield and state indexes (NDVI and EVI amplitudes). In the beta regression, the precision parameter was also modeled as dependent on the explanatory variables, improving the distribution fit. Overall, the probabilistic model was not very representative, considerably underestimating insurance premium rates relative to market rates, but it still contributes to a comparative understanding of production-loss risk scenarios for the soybean crop.
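
The thesis's fitted model is not reproduced here; as a hedged illustration of how a premium rate follows from a yield distribution, the sketch below computes the generic actuarially fair rate (expected indemnity divided by the guarantee) from samples of a placeholder beta-distributed relative yield:

```python
import numpy as np

rng = np.random.default_rng(42)

# Placeholder samples from a fitted conditional yield distribution
# (relative yield, 1.0 = expected yield). The thesis fits beta/log-normal
# models; a beta distribution is used here purely for illustration.
relative_yield = rng.beta(a=8.0, b=2.0, size=100_000)

def premium_rate(yield_samples: np.ndarray, coverage: float) -> float:
    """Actuarially fair rate: expected indemnity divided by the guaranteed amount.

    The indemnity pays the shortfall below the guarantee (coverage level,
    expressed in relative-yield units).
    """
    guarantee = coverage
    shortfall = np.maximum(guarantee - yield_samples, 0.0)
    return float(shortfall.mean() / guarantee)

for cov in (0.65, 0.70, 0.75):
    print(f"coverage {cov:.0%}: premium rate {premium_rate(relative_yield, cov):.3%}")
```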
49

Extraction de relations spatio-temporelles à partir des données environnementales et de la santé / Spatio-temporal data mining from health and environment data

Alatrista-Salas, Hugo 04 October 2013 (has links)
Thanks to the explosion of new technologies (mobile devices, sensors, etc.), large amounts of data located in space and time are now available. The associated databases can be called spatio-temporal databases because each record is described by spatial information (e.g. a city, a neighborhood, a river) and temporal information (e.g. the date of an event). These often heterogeneous and complex masses of data generate new needs that knowledge extraction methods must be able to address (e.g. following phenomena in time and space). Many phenomena with complex dynamics are thus associated with spatio-temporal data. For instance, the dynamics of an infectious disease can be described by the interactions between humans and the transmission vector, as well as by the spatio-temporal mechanisms involved in its evolution; modifying one component of such a system can trigger changes in the interactions between components and, ultimately, change the overall behavior of the system. To face these new challenges, new processes and methods must be developed to make the best use of all available data. 
This is the goal of spatio-temporal data mining, the set of techniques and methods for obtaining useful knowledge from large volumes of spatio-temporal data. This thesis falls within the general framework of spatio-temporal data mining and sequential pattern mining. More precisely, two generic pattern mining methods are proposed. The first extracts sequential patterns that include spatial characteristics of the data. In the second, we propose a new type of pattern called spatio-sequential patterns, used to study the evolution of a set of events describing an area and its near environment. Both approaches were tested on real datasets associated with two spatio-temporal phenomena: river pollution in France and the epidemiological monitoring of dengue in New Caledonia. In addition, two quality measures and a pattern visualization prototype are also proposed to assist experts in selecting patterns of interest.
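
The formal definition of spatio-sequential patterns is given in the thesis; as a simplified illustration of the underlying idea, the sketch below counts the support of a plain sequential pattern over per-zone event sequences, with hypothetical event labels:

```python
from typing import List, Set

def supports(sequence: List[Set[str]], pattern: List[Set[str]]) -> bool:
    """True if `pattern` is a subsequence of `sequence` (itemset containment, order kept)."""
    i = 0
    for itemset in sequence:
        if i < len(pattern) and pattern[i] <= itemset:
            i += 1
    return i == len(pattern)

# One event sequence per monitored zone (hypothetical event labels).
zone_sequences = [
    [{"heavy_rain"}, {"high_turbidity"}, {"pollution_peak"}],
    [{"heavy_rain"}, {"low_flow"}, {"high_turbidity", "pollution_peak"}],
    [{"low_flow"}, {"pollution_peak"}],
]

pattern = [{"heavy_rain"}, {"pollution_peak"}]
support = sum(supports(seq, pattern) for seq in zone_sequences) / len(zone_sequences)
print(f"support of {pattern}: {support:.2f}")   # fraction of zones whose history contains it
```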
50

Fuzzy Association Rule Mining From Spatio-temporal Data: An Analysis Of Meteorological Data In Turkey

Unal Calargun, Seda 01 January 2008 (has links) (PDF)
Data mining is the extraction of interesting, non-trivial, implicit, previously unknown, and potentially useful information or patterns from data in large databases. Association rule mining is a data mining method that seeks to discover associations among transactions encoded within a database. Data mining on spatio-temporal data takes into consideration the dynamics of spatially extended systems for which large amounts of spatial data exist, given that all real-world spatial data exists in some temporal context. Fuzzy sets are needed when mining association rules from spatio-temporal databases because they handle numerical data better, softening sharp attribute boundaries and thereby modeling the uncertainty embedded in the meaning of the data. In this thesis, fuzzy association rule mining is performed on spatio-temporal data using data cubes and the Apriori algorithm, and a methodology is developed for fuzzy spatio-temporal data cube construction. Besides performance, the criteria of interpretability, precision, utility, novelty, direct-to-the-point, and visualization are defined as the metrics for comparing association rule mining techniques, and the fuzzy association rule mining approaches performed within the scope of this thesis, using spatio-temporal data cubes and the Apriori algorithm, are compared using these metrics. Real meteorological data (precipitation and temperature) for Turkey, recorded between 1970 and 2007, are analyzed using the data cube and the Apriori algorithm in order to generate fuzzy association rules.
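
The thesis's cube-based procedure is not reproduced here; the sketch below illustrates only the fuzzification step and a fuzzy support count for one candidate itemset, using triangular membership functions and the min t-norm on synthetic weather values:

```python
import numpy as np

def triangular(x: np.ndarray, a: float, b: float, c: float) -> np.ndarray:
    """Triangular membership function with support [a, c] and peak at b."""
    left = (x - a) / (b - a) if b > a else np.ones_like(x)
    right = (c - x) / (c - b) if c > b else np.ones_like(x)
    return np.clip(np.minimum(left, right), 0.0, 1.0)

rng = np.random.default_rng(7)
temperature = rng.normal(12.0, 8.0, size=500)      # toy monthly mean temperature, degrees C
precipitation = rng.gamma(2.0, 25.0, size=500)     # toy monthly precipitation, mm

# Fuzzy sets for the linguistic terms used in a candidate rule.
temp_cold = triangular(temperature, -10.0, 0.0, 10.0)
precip_high = triangular(precipitation, 60.0, 120.0, 180.0)

# Fuzzy support of the itemset {temperature is cold, precipitation is high}
# using the min t-norm; an Apriori-style search keeps the itemset if this
# support exceeds a chosen threshold.
fuzzy_support = np.minimum(temp_cold, precip_high).mean()
print(f"fuzzy support: {fuzzy_support:.3f}")
```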
