Global ETD Search

1	Towards Algorithm Transformation for Temporal Data Mining on GPU Ponce, Sean Philip 18 August 2009 Data mining allows one to analyze large amounts of data. With increasing amounts of data being collected, more computing power is needed to mine these ever larger volumes of data. The GPU is an excellent piece of hardware with a compelling price-to-performance ratio and has rapidly risen in popularity. However, this increase in speed comes at a cost. The GPU's architecture executes non-data-parallel code with either marginal speedup or even slowdown. The type of data mining we examine, temporal data mining, uses a finite state machine (FSM), which is non-data parallel. We contribute the concept of algorithm transformation for increasing the data parallelism of an algorithm. We apply the algorithm transformation process to temporal data mining, producing an algorithm that solves the same problem as the FSM-based algorithm but is data parallel. The new GPU implementation shows a 6x speedup over the best CPU implementation and an 11x speedup over a previous GPU implementation. / Master of Science CUDA GPGPU temporal data mining
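To illustrate why an FSM-based counter is hard to parallelize, here is a minimal Python sketch of counting non-overlapped occurrences of a serial episode in an event stream. It is an assumption-laden illustration, not the thesis's algorithm: the function name `count_serial_episode` and the choice of non-overlapped counting are hypothetical, since the abstract does not specify them.

```python
from typing import Sequence


def count_serial_episode(events: Sequence[str], episode: Sequence[str]) -> int:
    """Count non-overlapped occurrences of `episode` (in order) within `events`.

    Hypothetical illustration: a single finite state machine walks the
    event stream once, advancing whenever it sees the symbol it is waiting for.
    """
    if not episode:
        return 0
    state = 0   # index of the next episode symbol the FSM is waiting for
    count = 0
    for ev in events:
        if ev == episode[state]:
            state += 1
            if state == len(episode):
                # Full episode seen: record it and restart the machine.
                count += 1
                state = 0
    return count


if __name__ == "__main__":
    stream = list("ABCACBABC")
    print(count_serial_episode(stream, list("ABC")))
```

Each iteration reads the `state` left behind by the previous event, so the loop carries a sequential dependency across the whole stream. That dependency is the kind of non-data-parallel structure the abstract says performs poorly on a GPU.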
2	Spatio-Temporal Data Mining for Location-Based Services Gidofalvi, Gyözö January 2008 Largely driven by advances in communication and information technology, such as the increasing availability and accuracy of GPS technology and the miniaturization of wireless communication devices, Location–Based Services (LBS) are continuously gaining popularity. Innovative LBSes integrate knowledge about the users into the service. Such knowledge can be derived by analyzing the location data of users. Such data contain two unique dimensions, space and time, which need to be analyzed. The objectives of this thesis are three–fold. First, to extend popular data mining methods to the spatio–temporal domain. Second, to demonstrate the usefulness of the extended methods and the derived knowledge in two promising LBS examples. Finally, to eliminate privacy concerns in connection with spatio–temporal data mining by devising systems for privacy–preserving location data collection and mining. To this end, Chapter 2 presents a general methodology, pivoting, to extend a popular data mining method, namely rule mining, to the spatio–temporal domain. By considering the characteristics of a number of real–world data sources, Chapter 2 also derives a taxonomy of spatio–temporal data, and demonstrates the usefulness of the rules that the extended spatio–temporal rule mining method can discover. In Chapter 4 the proposed spatio–temporal extension is applied to find long, sharable patterns in trajectories of moving objects. Empirical evaluations show that the extended method and its variants, using high–level SQL implementations, are effective tools for analyzing trajectories of moving objects. Real–world trajectory data about a large population of objects moving over extended periods within a limited geographical space is difficult to obtain. 
To aid the development in spatio–temporal data management and data mining, Chapter 3 develops a Spatio–Temporal ACTivity Simulator (ST–ACTS). ST–ACTS uses a number of real–world geo–statistical data sources and intuitive principles to effectively generate realistic spatio–temporal activities of mobile users. Chapter 5 proposes an LBS in the transportation domain, namely cab–sharing. To deliver an effective service, a unique spatio–temporal grouping algorithm is presented and implemented as a sequence of SQL statements. Chapter 6 identifies a scalability bottleneck in the grouping algorithm. To eliminate the bottleneck, the chapter expresses the grouping algorithm as a continuous stream query in a data stream management system, and then devises simple but effective spatio–temporal partitioning methods for streams to parallelize the computation. Experimental results show that parallelization through adaptive partitioning methods leads to speed–ups of orders of magnitude without significantly affecting the quality of the grouping. Spatio–temporal stream partitioning is expected to be an effective method to scale computation–intensive spatial queries and spatial analysis methods for streams. Location–Based Advertising (LBA), the delivery of relevant commercial information to mobile consumers, is considered to be one of the most promising business opportunities amongst LBSes. To this end, Chapter 7 describes an LBA framework and an LBA database that can be used for the management of mobile ads. Using a simulated but realistic mobile consumer population and a set of mobile ads, the LBA database is used to estimate the capacity of the mobile advertising channel. The estimates show that the channel capacity is extremely large, which is evidence for a strong business case, but it also necessitates adequate user controls. When data about users is collected and analyzed, privacy naturally becomes a concern. 
To eliminate the concerns, Chapter 8 first presents a grid–based framework in which location data is anonymized through spatio–temporal generalization, and then proposes a system for collecting and mining anonymous location data. Experimental results show that the privacy–preserving data mining component discovers patterns that, while probabilistic, are accurate enough to be useful for many LBSes. To eliminate any uncertainty in the mining results, Chapter 9 proposes a system for collecting exact trajectories of moving objects in a privacy–preserving manner. In the proposed system there are no trusted components and anonymization is performed by the clients in a P2P network via data cloaking and data swapping. Realistic simulations show that under reasonable conditions and privacy/anonymity settings the proposed system is effective. / QC 20120215
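The grid-based spatio-temporal generalization mentioned for Chapter 8 of the thesis above can be pictured with a small Python sketch. This is only an assumed, simplified reading of "generalization" (snapping an exact point to a grid cell and a time slot); the function name `generalize` and the parameters `cell_size` and `time_step` are hypothetical and do not come from the thesis.

```python
import math
from typing import Tuple


def generalize(x: float, y: float, t: float,
               cell_size: float, time_step: float) -> Tuple[int, int, int]:
    """Replace an exact (x, y, t) observation with coarse grid indices.

    Hypothetical sketch: every observation falling in the same spatial cell
    and the same time slot becomes indistinguishable after generalization.
    """
    return (
        math.floor(x / cell_size),   # column index of the spatial cell
        math.floor(y / cell_size),   # row index of the spatial cell
        math.floor(t / time_step),   # index of the time slot
    )


if __name__ == "__main__":
    # Two nearby observations (metres, seconds) map to the same coarse cell.
    print(generalize(1250.0, 830.0, 95.0, cell_size=500.0, time_step=60.0))
    print(generalize(1400.0, 990.0, 110.0, cell_size=500.0, time_step=60.0))
```

Larger values of `cell_size` and `time_step` hide more detail at the cost of less precise mined patterns, which matches the abstract's remark that the discovered patterns are probabilistic.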
3	Pattern-Aware Prediction for Moving Objects Hoyoung Jeung Unknown Date This dissertation tackles an unstudied area in the moving objects database domain: predicting (long-term) future locations of moving objects. Moving object prediction enables a wide range of applications, such as traffic prediction, pre-detection of an aircraft collision, and reporting attractive gas prices to drivers along their routes ahead. Nevertheless, existing location prediction techniques are limited in their ability to support such applications since they are generally capable only of short-term predictions. In the real world, many objects exhibit typical movement patterns. This pattern information can serve as important background for tackling the limitations of the existing prediction methods. We aim to offer foundations of pattern-aware prediction for moving objects, rendering more precise prediction results. Specifically, this thesis focuses on three parts. The first part of the thesis studies the problem of predicting future locations of moving objects in Euclidean space. We introduce a novel prediction approach, termed the hybrid prediction model, which utilizes not only the current motion of an object, but also the object's trajectory patterns for prediction. We define, mine, and index the trajectory patterns with a novel access method for efficient query processing. We then propose two different query processing techniques according to the given query time, i.e., for the near future and for the distant future. The second part covers the prediction problem for moving objects in network space. We formulate a network mobility model that offers a concise representation of mobility statistics extracted from massive collections of historical object trajectories. This model captures turning patterns of the objects at junctions, at the granularity of individual objects as well as globally. 
Based on the model, we develop three different algorithms for predicting the future path of a mobile user moving in a road network, named the PathPredictors. The third part of the thesis extends the prediction problem for a single object to that for multiple objects. We introduce a convoy query that retrieves all groups of objects, i.e., convoys, from the objects' historical trajectories. Each convoy consists of objects that have traveled together for some time; thus they may also move together in the future. We then propose three efficient algorithms for convoy discovery, called the CuTS family, that adopt line simplification methods for reducing the size of the trajectories, permitting efficient query processing. For each part, we present comprehensive experimental results for our proposals, which show significantly improved accuracy for moving object prediction compared with state-of-the-art methods, while also facilitating efficient query processing. Moving Objects Databases Predictive Query Processing Spatio-Temporal Data Mining
5	Pattern Extraction By Using Both Spatial And Temporal Features On Turkish Meteorological Data Goler, Isil 01 January 2011 With the growth in the size of datasets, data mining has been an important research topic and has received substantial interest from both academia and industry for many years. In particular, spatio-temporal data mining, the mining of knowledge from large amounts of spatio-temporal data, is a highly demanding field because huge amounts of spatio-temporal data are collected in various applications. Therefore, spatio-temporal data mining requires the development of novel data mining algorithms and computational techniques for a successful analysis of large spatio-temporal databases. In this thesis, a spatio-temporal mining technique is proposed and applied to Turkish meteorological data collected from various weather stations in Turkey. This study also includes an analysis and interpretation of the spatio-temporal rules generated for the Turkish meteorological data set. We introduce a second-level mining technique which is used to define general trends of the patterns according to the spatial changes. Generated patterns are investigated under different temporal sets in order to monitor the changes of the events with respect to temporal changes. QA Computer Software 76.75-76.765
6	Temporal Mining for Distributed Systems Jiang, Yexi 23 March 2015 Many systems and applications are continuously producing events. These events are used to record the status of the system and trace the behaviors of the systems. By examining these events, system administrators can check the potential problems of these systems. If the temporal dynamics of the systems are further investigated, the underlying patterns can be discovered. The uncovered knowledge can be leveraged to predict the future system behaviors or to mitigate the potential risks of the systems. Moreover, the system administrators can utilize the temporal patterns to set up event management rules to make the system more intelligent. With the popularity of data mining techniques in recent years, these events gradually become more and more useful. Despite the recent advances of the data mining techniques, the application to system event mining is still in a rudimentary stage. Most works still focus on episode mining or frequent pattern discovery. These methods are unable to provide a brief yet comprehensible summary to reveal the valuable information from a high-level perspective. Moreover, these methods provide little actionable knowledge to help the system administrators better manage the systems. To make better use of the recorded events, more practical techniques are required. From the perspective of data mining, three correlated directions are considered to be helpful for system management: (1) Provide concise yet comprehensive summaries about the running status of the systems; (2) Make the systems more intelligent and autonomous; (3) Effectively detect the abnormal behaviors of the systems. Due to the richness of the event logs, all these directions can be addressed in a data-driven manner. In this way, the robustness of the systems can be enhanced and the goal of autonomous management can be approached. 
This dissertation mainly focuses on the foregoing directions that leverage temporal mining techniques to facilitate system management. More specifically, three concrete topics will be discussed, including event summarization, resource demand prediction, and streaming anomaly detection. Besides the theoretical contributions, the experimental evaluation will also be presented to demonstrate the effectiveness and efficacy of the corresponding solutions. Data Mining Temporal Data Mining Computer Sciences Databases and Information Systems
7	Moving Object Trajectory Based Intelligent Traffic Information Hub Rui, Zhu January 2013 Congestion is a major problem in most metropolitan areas and, given the increasing rate of urbanization, it is likely to be an even more serious problem in the rapidly expanding mega cities. One possible method to combat congestion is to provide intelligent traffic management systems that can in a timely manner inform drivers about current or predicted traffic congestions that are relevant to them on their journeys. The detection of traffic congestion and the determination of whom to send advance notifications about the detected congestions are the objectives of the present research. By adopting a grid based discretization of space, the proposed system extracts and maintains traffic flow statistics and mobility statistics from the grid based recent trajectories of moving objects, and captures periodical spatio-temporal changes in the traffic flows and movements by managing statistics for relevant temporal domain projections, i.e., hour-of-day and day-of-week. Then, the proposed system identifies a directional congestion as a cell and its immediate neighbor, where the speed and flow of the objects that have moved from the neighbor to the cell significantly deviate from the historical speed and flow statistics. Subsequently, based on one of two notification criteria, namely, Mobility Statistic Criterion (MSC) and Linear Movement Criterion (LMC), the system decides which objects are likely to be affected by the identified congestions and sends out notifications to the corresponding objects such that the number of false negative (missed) and false positive (unnecessary) notifications is minimized. The thesis discusses the design and DBMS-based implementation of the proposed system. Empirical evaluations on realistically simulated trajectory data assess the accuracy of the methods and test the scalability of the system for varying input sizes and parameter settings. 
The accuracy assessment results show that the MSC based system achieves an optimal performance with a true positive notification rate of 0.67 and a false positive notification rate of 0.05 when min prob equals 0.35, which is superior to the performance of the LMC based system. The execution time of and the space used by the system scale linearly with the input size (number of concurrently moving vehicles) and the methods' mutually dependent parameters (grid resolution r and RT length l) that jointly define a spatio-temporal resolution. Within the area of a large city (40km by 40km), assuming a 60km/h average vehicle speed, the system, running on a commodity personal computer, can manage the described congestion detection and three-minute-ahead notification tasks within real-time requirements for 2000 and 20000 concurrently moving vehicles for spatio-temporal resolutions (r=100m, l=19) and (r=2km, l=3), respectively. Spatio-temporal Data Mining Congestion Detection and Notification LBS Intelligent Transport Systems Information Systems
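The directional congestion test described in the abstract above (a cell and its immediate neighbor whose current speed deviates from history) might be sketched in Python as follows. This is a simplified assumption, not the thesis's DBMS-based implementation: it checks speed only, ignores flow, and the names `directional_congestions`, `historical_speed` and the threshold `ratio` are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple

Cell = Tuple[int, int]
Move = Tuple[Cell, Cell, float]   # (neighbor cell, target cell, speed in km/h)


def directional_congestions(moves: Iterable[Move],
                            historical_speed: Dict[Tuple[Cell, Cell], float],
                            ratio: float = 0.5) -> List[Tuple[Cell, Cell]]:
    """Return directed (neighbor, cell) pairs that look congested.

    Hypothetical rule: a pair is flagged when the mean speed of recent moves
    from the neighbor into the cell is below `ratio` times the historical mean.
    """
    recent = defaultdict(list)
    for src, dst, speed in moves:
        recent[(src, dst)].append(speed)

    congested = []
    for pair, speeds in recent.items():
        hist = historical_speed.get(pair)
        if hist is not None and mean(speeds) < ratio * hist:
            congested.append(pair)
    return sorted(congested)


if __name__ == "__main__":
    moves = [((0, 0), (0, 1), 10.0), ((0, 0), (0, 1), 14.0), ((1, 1), (1, 2), 50.0)]
    history = {((0, 0), (0, 1)): 50.0, ((1, 1), (1, 2)): 55.0}
    print(directional_congestions(moves, history))
```

Keying the statistics by the ordered pair of cells is what makes the congestion directional: slow traffic from cell A into cell B does not flag the opposite direction.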
8	A machine learning based spatio-temporal data mining approach for coastal remote sensing data Gokaraju, Balakrishna 07 August 2010 Continuous monitoring of coastal ecosystems aids in better understanding of their dynamics and inherent harmful effects. As many of these ecosystems prevail over space and time, there is a need for mining this spatio-temporal information for building accurate monitoring and forecast systems. Harmful Algal Blooms (HABs) pose an enormous threat to the U.S. marine habitation and economy in the coastal waters. Federal and state coastal administrators have been devising state-of-the-art monitoring and forecasting systems for these HAB events. The efficacy of a monitoring and forecasting system relies on the performance of HAB detection. A Machine Learning based Spatio-Temporal data mining approach for the detection of HAB (STML-HAB) events in the region of the Gulf of Mexico is proposed in this work. The spatio-temporal cubical neighborhood around the training sample is considered to retrieve relevant spectral information pertaining to both HAB and Non-HAB classes. A unique relevant feature subset combination is derived through an evolutionary computation technique for better classification of HAB from Non-HAB. Kernel based feature transformation and classification is used in developing the model. The STML-HAB model gave significant performance improvements over the current optical detection based techniques by greatly reducing the false alarm rate, with an accuracy of 0.9642 on SeaWiFS data. The developed model is used for prediction on new datasets for further spatio-temporal analyses such as the seasonal variations of HAB and the sequential occurrence of algal blooms. New variability visualizations are introduced to illustrate the dynamic behavior and seasonal variations of HABs from large spatiotemporal datasets. The results outperformed the ensemble of the currently available empirical methods for HAB detection. 
The ensemble method is implemented by a new approach for combining the empirical models using a probabilistic neural network model. The model is also compared with the results obtained using various feature extraction techniques, spatial neighborhoods and classifiers. Machine Learning Spatio-Temporal Data Mining Support Vector Machines Kernel Methods
9	Multiple Uses of Frequent Episodes in Temporal Process Modeling Patnaik, Debprakash 19 August 2011 This dissertation investigates algorithmic techniques for temporal process discovery in many domains. Many different formalisms have been proposed for modeling temporal processes such as motifs, dynamic Bayesian networks and partial orders, but the direct inference of such models from data has been computationally intensive or even intractable. In this work, we propose the mining of frequent episodes as a bridge to inferring more formal models of temporal processes. This enables us to combine the advantages of frequent episode mining, which conducts level-wise search over constrained spaces, with the formal basis of process representations, such as probabilistic graphical models and partial orders. We also investigate the mining of frequent episodes in infinite data streams, which further expands their applicability into many modern data mining contexts. To demonstrate the usefulness of our methods, we apply them in different problem contexts such as sensor networks in data centers, multi-neuronal spike train analysis in neuroscience, and electronic medical records in medical informatics. / Ph. D. motifs graphical models frequent episodes dynamic Bayesian networks temporal data mining
10	Extraction de relations spatio-temporelles à partir des données environnementales et de la santé / Spatio-temporal data mining from health and environment data Alatrista-Salas, Hugo 04 October 2013 Thanks to new technologies (smartphones, sensors, etc.), large amounts of spatiotemporal data are now available. The associated databases can be called spatiotemporal databases because each record is described by spatial information (e.g. a city, a neighborhood, a river, etc.) and temporal information (e.g. the date of an event). These data are often complex and heterogeneous, and they create new needs that knowledge extraction methods must address (e.g. following phenomena in time and space). Many phenomena with complex dynamics are thus associated with spatiotemporal data. For instance, the dynamics of an infectious disease can be described by the interactions between humans and the transmission vector as well as by some spatiotemporal mechanisms involved in its development. The modification of one of these components can trigger changes in the interactions between the components and ultimately change the overall system behavior. To deal with these new challenges, new processes and methods must be developed to make the best use of all available data. In this context, spatiotemporal data mining is defined as a set of techniques and methods used to obtain useful information from large volumes of spatiotemporal data. 
This thesis follows the general framework of spatiotemporal data mining and sequential pattern mining. More specifically, two generic methods of pattern mining are proposed. The first one allows us to extract sequential patterns including spatial characteristics of data. In the second one, we propose a new type of pattern called spatio-sequential patterns. This kind of pattern is used to study the evolution of a set of events describing an area and its near environment. Both approaches were tested on real datasets associated with two spatiotemporal phenomena: the pollution of rivers in France and the epidemiological monitoring of dengue in New Caledonia. In addition, two measures of quality and a pattern visualization prototype are also proposed to assist the experts in the selection of interesting patterns. Fouille de données spatio-temporelles Information Géographique Recherche de corrélations Exploration de données Système de détection épidémiologique Spatio-temporal data mining Geographic information Research of correlations Data exploration Epidemiology detection system

Search results