  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
21

Mixture models for ROC curve and spatio-temporal clustering

Cheam, Amay SM January 2016 (has links)
Finite mixture models have had a profound impact on the history of statistics, contributing to the modelling of heterogeneous populations, generalizing distributional assumptions and, lately, providing a convenient framework for classification and clustering. A novel approach, via the Gaussian mixture distribution, is introduced for modelling receiver operating characteristic (ROC) curves. Because the resulting ROC curve has no closed functional form, the Monte Carlo method is employed. The approach compares favourably with existing methods when applied to real data. In practice, however, data are often non-normal, atypical, or skewed, so non-Gaussian distributions are needed to fit them well. Two non-Gaussian mixtures, based on the t and skew-t distributions, are therefore proposed and applied to real data. Finally, a novel mixture is presented for clustering spatial and temporal data: the proposed model defines each mixture component as a mixture of autoregressive polynomials with logistic links. The new model performs significantly better than the best-known model-based clustering techniques when applied to real data. / Thesis / Doctor of Philosophy (PhD)
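The Monte Carlo step described above can be sketched in a few lines: model each group's diagnostic scores with a Gaussian mixture, then trace the ROC curve by sampling, since the ROC of a mixture has no closed form. The mixture parameters and the `sample_mixture` helper below are invented for illustration; the thesis fits the mixtures to real diagnostic data.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mixture(weights, means, sds, n, rng):
    """Draw n samples from a univariate Gaussian mixture (hypothetical helper)."""
    comp = rng.choice(len(weights), size=n, p=weights)
    return rng.normal(np.asarray(means)[comp], np.asarray(sds)[comp])

# Hypothetical fitted mixtures for healthy / diseased test scores
# (parameters invented for illustration, not taken from the thesis).
n = 5000
s_h = sample_mixture([0.7, 0.3], [0.0, 2.0], [1.0, 0.5], n, rng)
s_d = sample_mixture([0.6, 0.4], [4.0, 6.0], [1.0, 0.7], n, rng)

# The ROC curve of a mixture has no closed form, so trace it by Monte
# Carlo: sweep a decision threshold over the pooled sampled scores.
thresholds = np.quantile(np.concatenate([s_h, s_d]), np.linspace(0, 1, 200))
fpr = np.array([(s_h > t).mean() for t in thresholds])  # 1 - specificity
tpr = np.array([(s_d > t).mean() for t in thresholds])  # sensitivity

# AUC as the Monte Carlo estimate of P(diseased score > healthy score).
auc = (s_d[:, None] > s_h[None, :]).mean()
```

With well-separated groups, as here, the estimated AUC approaches 1; the (fpr, tpr) pairs can be plotted directly as the empirical ROC curve.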
22

Designing podcast listening history visualizations on mobile screens: A design study investigating visual representations of temporal data

Forsrup, Ben January 2020 (has links)
As listening to podcasts has increased in most Western countries, podcast applications need to become more engaging in content, functionality, and visual design, and to offer original features that differentiate them from the competition. One such feature is data visualization, which this study argues can deliver additional value for users. This design study investigated the possibility of visually representing personal podcast listening history in a mobile application, focusing on the following question: How can visualization of podcast listening history add value to the user experience on a mobile screen? Research on visualization of temporal data and on user experience served as the foundational theory, supplemented by a survey of state-of-the-art products. Using a variant of the design process described in Design Study Methodology, a final design was developed iteratively with focus groups and usability tests. After a proposed solution was designed and implemented, usability tests were conducted remotely using videos of a high-fidelity prototype. One conclusion is that a successful visualization of podcast listening history on mobile screens should include only meaningful animations and interactions, and should separate visual elements from filtering options. More refined and expansive studies with more participants are required to determine the best implementation strategy; user tests in this study were limited by the Covid-19 outbreak.
23

A machine learning based spatio-temporal data mining approach for coastal remote sensing data

Gokaraju, Balakrishna 07 August 2010 (has links)
Continuous monitoring of coastal ecosystems aids in better understanding their dynamics and inherent harmful effects. Because many of these phenomena extend over space and time, this spatio-temporal information must be mined to build accurate monitoring and forecast systems. Harmful Algal Blooms (HABs) pose an enormous threat to U.S. marine habitats and the coastal economy, and federal and state coastal administrators have been devising state-of-the-art monitoring and forecasting systems for HAB events, whose efficacy relies on the performance of HAB detection. This work proposes a machine-learning-based spatio-temporal data mining approach for detecting HAB events (STML-HAB) in the Gulf of Mexico region. The spatio-temporal cubical neighborhood around each training sample is considered in order to retrieve the spectral information relevant to both the HAB and non-HAB classes. A relevant feature-subset combination is derived through an evolutionary computation technique for better discrimination of HAB from non-HAB, and kernel-based feature transformation and classification are used in developing the model. STML-HAB gave significant performance improvements over current optical-detection-based techniques, greatly reducing the false-alarm rate and reaching an accuracy of 0.9642 on SeaWiFS data. The developed model is used for prediction on new datasets to support further spatio-temporal analyses, such as seasonal variations of HABs and the sequential occurrence of algal blooms, and new variability visualizations are introduced to illustrate the dynamic behavior and seasonal variations of HABs in large spatio-temporal datasets. The results outperformed an ensemble of the currently available empirical HAB-detection methods, implemented via a new approach that combines the empirical models using a probabilistic neural network. The model is also compared with results obtained using various feature-extraction techniques, spatial neighborhoods, and classifiers.
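The spatio-temporal cubical neighborhood idea can be illustrated with a short sketch: around each training pixel, take a small cube of spectral values in both space and time. The array layout (time × rows × cols × bands), the half-width parameters, and the helper name are assumptions for illustration, not taken from the thesis.

```python
import numpy as np

def cubical_neighborhood(cube, t, i, j, r_s=1, r_t=1):
    """Extract the spatio-temporal cube of spectral values around sample (t, i, j).

    cube: array of shape (time, rows, cols, bands); r_s / r_t are the spatial /
    temporal half-widths. The neighborhood is clipped at the data boundaries
    and flattened into a feature vector. (Hypothetical layout -- the thesis
    does not specify array conventions.)
    """
    T, R, C, _ = cube.shape
    ts = slice(max(t - r_t, 0), min(t + r_t + 1, T))
    rs = slice(max(i - r_s, 0), min(i + r_s + 1, R))
    cs = slice(max(j - r_s, 0), min(j + r_s + 1, C))
    return cube[ts, rs, cs, :].reshape(-1)

# Toy data cube: 5 time steps, a 10x10 grid, 8 spectral bands.
cube = np.random.default_rng(1).random((5, 10, 10, 8))
# Interior sample: 3 (time) x 3 x 3 (space) x 8 bands = 216 values.
feat = cubical_neighborhood(cube, t=2, i=4, j=4)
```

Feature vectors like `feat` would then feed a feature-selection and kernel-classification stage.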
24

New Procedures for Data Mining and Measurement Error Models with Medical Imaging Applications

Wang, Xiaofeng 15 July 2005 (has links)
No description available.
25

The Evolution of Urban-Rural Space

Olson, Jeffrey L. January 2013 (has links)
No description available.
26

Multiple Uses of Frequent Episodes in Temporal Process Modeling

Patnaik, Debprakash 19 August 2011 (has links)
This dissertation investigates algorithmic techniques for temporal process discovery across many domains. Many formalisms have been proposed for modeling temporal processes, such as motifs, dynamic Bayesian networks, and partial orders, but the direct inference of such models from data has been computationally intensive or even intractable. In this work, we propose the mining of frequent episodes as a bridge to inferring more formal models of temporal processes. This lets us combine the advantages of frequent episode mining, which conducts level-wise search over constrained spaces, with the formal basis of process representations such as probabilistic graphical models and partial orders. We also investigate mining frequent episodes in infinite data streams, which further expands their applicability to many modern data mining contexts. To demonstrate the usefulness of our methods, we apply them in different problem contexts: sensor networks in data centers, multi-neuronal spike train analysis in neuroscience, and electronic medical records in medical informatics. / Ph. D.
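A minimal sketch of window-based frequent-episode counting, assuming integer timestamps and a toy symbol alphabet: a serial episode is "supported" by a sliding window when its symbols occur inside the window in order. This is one simplified frequency notion from the episode-mining literature; the dissertation's algorithms use more refined counts and search strategies.

```python
def episode_frequency(stream, episode, width):
    """Fraction of sliding windows supporting a serial episode.

    stream: list of (timestamp, symbol) pairs sorted by time; a window
    [t, t + width) supports the episode if its symbols occur inside it
    in order. Windows start at every integer time step (a simplifying
    assumption for this sketch).
    """
    times = [t for t, _ in stream]
    hits = total = 0
    t = times[0]
    while t <= times[-1]:
        window = [sym for ts, sym in stream if t <= ts < t + width]
        it = iter(window)
        if all(sym in it for sym in episode):  # ordered-subsequence test
            hits += 1
        total += 1
        t += 1
    return hits / total

# Toy event stream: the episode "A then B" fits in the windows starting
# at t=0 and t=5, out of 8 windows of width 3.
stream = [(0, "A"), (1, "B"), (2, "C"), (5, "A"), (6, "C"), (7, "B")]
freq = episode_frequency(stream, ("A", "B"), width=3)
```

Episodes whose frequency clears a threshold become candidates for building larger episodes level by level.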
27

FlockViz: A Visualization Technique to Facilitate Multi-dimensional Analytics of Spatio-temporal Cluster Data

Hossain, Mohammad Zahid 26 May 2014 (has links)
Visual analytics of large amounts of spatio-temporal data is challenging because of the overlap and clutter produced by the movements of multiple objects. A common approach to analyzing such data is to consider how groups of items cluster and move together in space and time. However, most methods for showing Spatio-temporal Cluster (STC) properties concentrate on a few dimensions of the cluster (e.g., movement direction or density), leaving many other properties unrepresented. Furthermore, when representing multiple cluster attributes in a single view, existing methods fail to preserve the original shape of the cluster or distort the dataset's actual spatial coverage. In this thesis, I propose a simple yet effective visualization, FlockViz, that shows multiple STC data dimensions in a single view while preserving the original cluster shape. To evaluate this method, I develop a framework for categorizing the wide range of tasks involved in analyzing STCs. I conclude this work with a controlled user study comparing the performance of FlockViz with alternative visualization techniques on cluster-based analytic tasks. Finally, the exploration capability of FlockViz is demonstrated on real-life datasets, including fish movement, caribou movement, eagle migration, and hurricane movement. The results of the user studies and use cases confirm the advantage and novelty of the FlockViz design for visual analytic tasks.
28

Extraction de chroniques discriminantes / Discriminant chronicle mining

Dauxais, Yann 13 April 2018 (has links)
Data are recorded in a wide range of applications, and their analysis is a challenge addressed by many studies. Among these applications, this thesis is motivated by the analysis of care-pathway data for pharmaco-epidemiological studies. Pharmaco-epidemiology is the study of the uses and effects of healthcare products in well-defined populations; the goal here is to automate such studies through data analysis. Among data analysis methods, pattern mining approaches extract behavior descriptions, called patterns, that characterize the data. Patterns are often easily interpretable and give insight into hidden behaviors described by the data. This thesis focuses on mining discriminant temporal patterns from temporal sequences, i.e., lists of timestamped events. Temporal patterns represent behaviors expressively through their temporal dimension, and discriminant patterns are well suited to representing behaviors that occur specifically in small subsets of a population. Although temporal patterns are essential for describing timestamped data and discriminant patterns are crucial for identifying behaviors that differ from the mainstream, discriminant temporal patterns have received little attention so far. The model of discriminant chronicles is proposed to fill this gap. A chronicle is a temporal pattern representable as a graph whose nodes are events and whose edges carry numerical temporal constraints. The chronicle model was chosen for its high expressiveness on temporal sequences and for its unique ability, among discriminant pattern models, to describe the temporal dimension numerically.
The contribution of this thesis, centered on the discriminant chronicle model, is threefold: (i) a discriminant chronicle mining algorithm (DCM), (ii) a study of the interpretability of the chronicle model through its generalization, and (iii) an application of DCM to a pharmaco-epidemiology case study. DCM is an efficient algorithm dedicated to extracting discriminant chronicles, built on the Ripperk numerical rule-learning algorithm; using Ripperk takes advantage of its efficiency and of its incomplete heuristic, which avoids generating redundant patterns. The generalization of DCM allows Ripperk to be swapped for other machine learning algorithms, in which case the extracted patterns are no longer necessarily chronicles but a generalized form of them. A more expressive machine learning algorithm extracts more expressive generalized chronicles but reduces their interpretability; the trade-off between this expressiveness gain, evaluated through classification accuracy, and the interpretability loss is compared for several types of generalized chronicles. The interest of the discriminant chronicle model and the efficiency of DCM are validated on synthetic and real datasets in a pattern-based classification context. Finally, chronicles were extracted from a pharmaco-epidemiology dataset and presented to clinicians, who confirmed their interest for describing discriminant epidemiological behaviors.
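The chronicle model can be made concrete with a small sketch: a brute-force check of whether a timestamped sequence contains an occurrence of a chronicle, i.e., an instantiation of its event nodes satisfying every numerical interval constraint on its edges. The event labels and intervals below are invented, and the DCM algorithm itself mines chronicles very differently; this only illustrates the pattern semantics.

```python
from itertools import product

def satisfies_chronicle(sequence, events, constraints):
    """Check whether a timestamped sequence contains a chronicle occurrence.

    sequence: list of (timestamp, label). events: tuple of node labels.
    constraints: {(i, j): (lo, hi)} meaning lo <= t_j - t_i <= hi for
    chronicle nodes i and j. Brute force over candidate instantiations
    (fine for small toy inputs only).
    """
    candidates = [[t for t, lab in sequence if lab == e] for e in events]
    for times in product(*candidates):
        if all(lo <= times[j] - times[i] <= hi
               for (i, j), (lo, hi) in constraints.items()):
            return True
    return False

# Toy chronicle: A followed by B within [1, 3] days, B followed by C within [0, 2].
seq = [(0, "A"), (2, "B"), (3, "C"), (10, "A")]
chron_events = ("A", "B", "C")
chron_constraints = {(0, 1): (1, 3), (1, 2): (0, 2)}
found = satisfies_chronicle(seq, chron_events, chron_constraints)
```

A discriminant chronicle would be one whose occurrences are much more frequent in one labeled subpopulation of sequences than in the other.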
29

Représentations parcimonieuses et apprentissage de dictionnaires pour la classification et le clustering de séries temporelles / Time warp invariant sparse coding and dictionary learning for time series classification and clustering

Varasteh Yazdi, Saeed 15 November 2018 (has links)
Learning dictionaries that sparsely represent time series is an important problem for extracting latent temporal features, revealing salient primitives, and representing complex temporal data. This thesis addresses sparse coding and dictionary learning for time series classification and clustering under time warp. We propose a time-warp-invariant sparse coding and dictionary learning framework in which both input samples and atoms are time series of different lengths involving varying delays. In the first part, we formalize an L0 sparse coding problem and propose a time-warp-invariant orthogonal matching pursuit (TWI-OMP) based on a new cosine-maximization time-warp operator. For the dictionary learning stage, a nonlinear time-warp-invariant kSVD (TWI-kSVD) is proposed: thanks to a rotation transformation between each atom and its sibling atoms, a singular value decomposition is used to jointly approximate the coefficients and update the dictionary, as in the standard kSVD. In the second part, time-warp-invariant dictionary learning for time series clustering is formalized and a gradient-descent solution, TWI-DLCLUST, is proposed. The proposed methods are compared against major shift-invariant, convolutional, and kernel dictionary learning methods on several public and real temporal datasets. The experiments show the potential of the proposed frameworks to efficiently sparse-code, classify, and cluster time series under time warp.
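The classical orthogonal matching pursuit that the thesis builds on can be sketched briefly. This is the standard, warp-free algorithm only: TWI-OMP replaces the inner-product atom selection below with the thesis's cosine-maximization operator under time warp, which is not reproduced here. The dictionary and signal are synthetic.

```python
import numpy as np

def omp(D, x, k):
    """Orthogonal matching pursuit: greedily select k atoms (columns of D)
    to approximate x, refitting coefficients by least squares each step."""
    residual = x.copy()
    support = []
    for _ in range(k):
        # Select the atom most correlated with the current residual.
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        # Re-fit coefficients on all selected atoms, then update the residual.
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    codes = np.zeros(D.shape[1])
    codes[support] = coef
    return codes

rng = np.random.default_rng(0)
D = rng.standard_normal((100, 200))
D /= np.linalg.norm(D, axis=0)       # unit-norm atoms
x = 2.0 * D[:, 3] - 1.5 * D[:, 17]   # a 2-sparse signal in the dictionary
codes = omp(D, x, k=2)               # recovers the sparse code of x
```

In the dictionary learning loop, such sparse codes alternate with atom updates (kSVD in the standard setting, TWI-kSVD in the thesis).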
30

Programming Idioms and Runtime Mechanisms for Distributed Pervasive Computing

Adhikari, Sameer 13 October 2004 (has links)
The emergence of pervasive computing power and networking infrastructure is enabling new applications. Still, many milestones need to be reached before pervasive computing becomes an integral part of our lives. An important missing piece is the middleware that allows developers to easily create interesting pervasive computing applications. This dissertation explores the middleware needs of distributed pervasive applications. The main contributions of this thesis are the design, implementation, and evaluation of two systems: D-Stampede and Crest. D-Stampede allows pervasive applications to access live stream data from multiple sources using time as an index. Crest allows applications to organize historical events, and to reason about them using time, location, and identity. Together they meet the important needs of pervasive computing applications. D-Stampede supports a computational model called the thread-channel graph. The threads map to computing devices ranging from small to high-end processing elements. Channels serve as the conduits among the threads, specifically tuned to handle time-sequenced streaming data. D-Stampede allows the dynamic creation of threads and channels, and for the dynamic establishment (and removal) of the plumbing among them. The Crest system assumes a universe that consists of participation servers and event stores, supporting a set of applications. Each application consists of distributed software entities working together. The participation server helps the application entities to discover each other for interaction purposes. Application entities can generate events, store them at an event store, and correlate events. The entities can communicate with one another directly, or indirectly through the event store. We have qualitatively and quantitatively evaluated D-Stampede and Crest. The qualitative aspect refers to the ease of programming afforded by our programming abstractions for pervasive applications. 
The quantitative aspect measures the cost of the API calls, and the performance of an application pipeline that uses the systems.
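The channel abstraction described above, a conduit tuned to time-sequenced streaming data, can be sketched as a small thread-safe object indexed by timestamp. This is a simplified illustration in Python, not the D-Stampede API; the real system also handles garbage collection of old items, multiple connections, and dynamic plumbing across devices.

```python
import threading

class TimeIndexedChannel:
    """A toy conduit in the spirit of a time-indexed stream channel:
    producers put items tagged with a timestamp; consumers get the item
    for a given timestamp, blocking until it arrives."""
    def __init__(self):
        self._items = {}
        self._cond = threading.Condition()

    def put(self, timestamp, item):
        # Store the item under its timestamp and wake any waiting consumers.
        with self._cond:
            self._items[timestamp] = item
            self._cond.notify_all()

    def get(self, timestamp, timeout=None):
        # Block until the requested timestamp has been produced.
        with self._cond:
            self._cond.wait_for(lambda: timestamp in self._items, timeout)
            return self._items[timestamp]

# One producer thread feeds the channel; the consumer reads by timestamp.
chan = TimeIndexedChannel()
producer = threading.Thread(target=lambda: chan.put(42, "frame-42"))
producer.start()
result = chan.get(42, timeout=5)
producer.join()
```

Connecting many such channels between threads yields the thread-channel graph structure the abstract describes.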
