Global ETD Search

31	Latent Feature Models for Uncovering Human Mobility Patterns from Anonymized User Location Traces with Metadata Alharbi, Basma Mohammed 10 April 2017 In the mobile era, data capturing individuals’ locations have become unprecedentedly available. Data from Location-Based Social Networks is one example of large-scale user-location data. Such data provide a valuable source for understanding patterns governing human mobility, and thus enable a wide range of research. However, mining and utilizing raw user-location data is a challenging task. This is mainly due to the sparsity of data (at the user level), the imbalance of data, with power-law distributed check-in degrees for users and locations (at the global level), and, more importantly, the lack of a uniform low-dimensional feature space describing users. Three latent feature models are proposed in this dissertation. Each proposed model takes as input a collection of user-location check-ins, and outputs a new representation space for users and locations respectively. To avoid invading users’ privacy, the proposed models are designed to learn from anonymized location data where only IDs - not geophysical positioning or category - of locations are utilized. To enrich the inferred mobility patterns, the proposed models incorporate metadata, often associated with user-location data, into the inference process. In this dissertation, two types of metadata are utilized to enrich the inferred patterns: timestamps and social ties. Time adds context to the inferred patterns, while social ties amplify incomplete user-location check-ins. The first proposed model incorporates timestamps by learning from collections of users’ locations sharing the same discretized time. The second proposed model also incorporates time into the learning model, yet takes a further step by considering time at different scales (hour of a day, day of a week, month, and so on). 
This change in modeling time allows for capturing meaningful patterns over different time scales. The last proposed model incorporates social ties into the learning process to compensate for inactive users who contribute a large volume of incomplete user-location check-ins. To assess the quality of the new representation spaces for each model, evaluation is done using an external application, social link prediction, in addition to case studies and analysis of inferred patterns. Each proposed model is compared to baseline models, where results show significant improvements. mobility pattern inference graphical models mixed-membership models Representation learning
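The multi-scale treatment of time described in this abstract can be illustrated with a small sketch. It is not the dissertation's model: the function names `time_scale_bins` and `group_checkins`, the tuple layout `(user_id, location_id, unix_timestamp)`, and the use of UTC are all assumptions made here, and the code only shows how anonymized check-ins might be grouped by a discretized time at one chosen scale.

```python
from datetime import datetime, timezone


def time_scale_bins(ts):
    """Map a Unix timestamp (seconds, interpreted as UTC) to discrete bins
    at several time scales. The choice of scales is an assumption."""
    dt = datetime.fromtimestamp(ts, tz=timezone.utc)
    return {
        "hour_of_day": dt.hour,       # 0..23
        "day_of_week": dt.weekday(),  # 0 = Monday .. 6 = Sunday
        "month": dt.month,            # 1..12
    }


def group_checkins(checkins, scale):
    """Group anonymized location IDs by (user, time bin) at one scale.

    checkins: iterable of (user_id, location_id, unix_timestamp) tuples
              (hypothetical layout, for illustration only).
    scale:    one of the keys returned by time_scale_bins.
    """
    groups = {}
    for user_id, location_id, ts in checkins:
        key = (user_id, time_scale_bins(ts)[scale])
        groups.setdefault(key, []).append(location_id)
    return groups


if __name__ == "__main__":
    sample = [
        ("u1", "L5", 0),
        ("u1", "L7", 1800),
        ("u1", "L5", 13 * 3600),
        ("u2", "L7", 0),
    ]
    print(group_checkins(sample, "hour_of_day"))
```

Grouping the same check-ins once per scale gives one collection per time resolution, which is one plausible way to feed several time scales into a single latent feature model.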
32	Simultaneous Measurement Imputation and Rehabilitation Outcome Prediction for Achilles Tendon Rupture Hamesse, Charles January 2018 Achilles Tendon Rupture (ATR) is one of the typical soft tissue injuries. Rehabilitation after such musculoskeletal injuries remains a prolonged process with a highly variable outcome. Being able to predict the rehabilitation outcome accurately is crucial for treatment decision support. In this work, we design a probabilistic model to predict the rehabilitation outcome for ATR using a clinical cohort with numerous missing entries. Our model is trained end-to-end in order to simultaneously predict the missing entries and the rehabilitation outcome. We evaluate our model and compare it with multiple baselines, including multi-stage methods. Experimental results demonstrate the superiority of our model over these baseline multi-stage approaches with various data imputation methods for ATR rehabilitation outcome prediction. Machine learning Probabilistic graphical models Healthcare Computer Systems
33	Extreme-Value Models and Graphical Methods for Spatial Wildfire Risk Assessment Cisneros, Daniela 11 September 2023 The statistical modeling of spatial extreme events, augmented by graphical models, provides a comprehensive framework for the development of techniques and models to describe natural phenomena in a variety of environmental, geoscience, and climate science applications. In a changing climate, the impact of natural hazards, such as wildfires, is believed to have evolved in frequency, size, and spatial extent, although regional responses may vary. The aforementioned impacts are of great significance due to their association with air pollution, irreversible harm to the environment and atmosphere, and the fact that they put human lives at risk. The prediction of wildfires holds significant importance within the realm of wildfire management due to its influence on the allocation of resources, the mitigation of detrimental consequences, and the subsequent recovery endeavors. Therefore, the development of robust statistical methodologies that can accurately forecast extreme wildfire occurrences across spatial and temporal dimensions is of great significance. In this thesis, we develop new spatial statistical models, combined with popular machine learning techniques, as well as novel extreme-value methods to enhance the prediction of wildfire risk using graphical models. First, in order to jointly and efficiently model high-dimensional wildfire counts and burnt areas over the whole contiguous United States, we propose a four-stage zero-inflated bivariate spatiotemporal model combining low-rank spatial models and random forests. Second, to model high values of the McArthur Forest Fire Danger Index over Australia, we develop a novel spatial extreme-value model based on mixtures of tree-based multivariate Pareto distributions. 
Our new methodology combines theoretically justified spatial extreme models with a computationally convenient graphical model framework to efficiently tackle spatial problems in high dimensions. Third, we exploit recent advancements in deep learning and build a parametric regression model using graph convolutional neural networks and the extended Generalized Pareto distribution, allowing us to jointly model moderate and extreme wildfires observed on an irregular spatial grid. We work with a novel dataset of Australian wildfires from 1999 to 2019, and analyse monthly spread over areas corresponding to Statistical Area Level 1 regions. We highlight the efficacy of our newly proposed model and perform risk assessment for Australia and dense communities. Statistics of extremes Spatio-temporal statistics Graphical models Multivariate Extremes
34	Learning for Spoken Dialog Systems with Discriminative Graphical Models Ma, Yi January 2015 No description available. Computer Science spoken dialog systems discriminative graphical models
35	Modeling Evolutionary Constraints and Improving Multiple Sequence Alignments using Residue Couplings Hossain, K.S.M. Tozammel 16 November 2016 Residue coupling in protein families has received much attention as an important indicator toward predicting protein structures and revealing functional insight into proteins. Existing coupling methods identify largely pairwise couplings and express couplings over amino acid combinations, which do not yield a mechanistic explanation. Most of these methods primarily use a multiple protein sequence alignment (most likely a resultant alignment) which better exposes couplings and is obtained through manual tweaking of an alignment constructed by a classical alignment algorithm. Classical alignment algorithms primarily focus on capturing conservations and may not fully unveil couplings in the alignment. In this dissertation, we propose methods for capturing both pairwise and higher-order couplings in protein families. Our methods provide mechanistic explanations for couplings using physicochemical properties of amino acids and discernibility between orders. We also investigate a method for mining frequent episodes (called coupled patterns) in an alignment produced by a classical algorithm for proteins and for exploiting the coupled patterns for improving the alignment quality in terms of exposition of couplings. We demonstrate the effectiveness of our proposed methods on a large collection of sequence datasets for protein families. / Ph. D. / Proteins are biomolecules that comprise amino acid compounds. A chain of amino acids (a.k.a. a protein sequence) forms the primary structure of a protein, and the shaping of this chain into various folds gives rise to a more complex 3D structure, a natural state of proteins. It is through their structures that proteins perform various activities. 
To preserve these activities in proteins, evolution allows only those changes in protein sequences that do not disrupt the overall structures and functions of proteins. Coupling is an evolutionary phenomenon that helps proteins preserve their structures and functions. Two or more amino acid positions are coupled if changes of amino acids at one position are compensated by changes at the other position(s). In this thesis, we propose a set of probabilistic methods for modeling such couplings between two or more positions. Our methods identify the most probable couplings in a set of protein sequences and express them with probabilistic graphical models (a powerful and interpretable framework), which can be used for answering questions related to protein structures, functions, and protein synthesis. Using this notion of coupling, we also develop a method for improving the quality of multiple protein sequence alignment, a widely used tool for protein sequence analyses. We evaluate our methods with a large collection of sequence datasets for protein families, and the results substantiate the efficacy of our methods. residue coupling multiple sequence alignment graphical models pattern set mining
36	Modelling of extremes Hitz, Adrien January 2016 This work focuses on statistical methods to understand how frequently rare events occur and what the magnitude of extreme values, such as large losses, is. It lies in a field called extreme value analysis, whose scope is to provide support for scientific decision making when extreme observations are of particular importance, such as in environmental applications, insurance and finance. In the univariate case, I propose new techniques to model tails of discrete distributions and illustrate them in an application on word frequency and multiple birth data. Suitably rescaled, the limiting tails of some discrete distributions are shown to converge to a discrete generalized Pareto distribution and generalized Zipf distribution respectively. In the multivariate high-dimensional case, I suggest modeling tail dependence between random variables by a graph such that its nodes correspond to the variables and shocks propagate through the edges. Relying on the ideas of graphical models, I prove that if the variables satisfy a new notion called asymptotic conditional independence, then the density of the joint distribution can be simplified and expressed in terms of lower dimensional functions. This generalizes the Hammersley-Clifford theorem and enables us to infer tail distributions from observations in reduced dimension. As an illustration, extreme river flows are modeled by a tree graphical model whose structure appears to recover almost exactly the actual river network. A fundamental concept when studying limiting tail distributions is regular variation. I propose a new notion in the multivariate case called one-component regular variation, of which Karamata's theorem and the representation theorem, two important results in the univariate case, are generalizations. 
Finally, I turn my attention to website visit data and fit a censored copula Gaussian graphical model allowing the visualization of users' behavior by a graph.
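For orientation, the classical Hammersley-Clifford theorem that the abstract of entry 36 mentions can be written as below. This is the standard textbook statement, not the thesis's tail-distribution generalization, and the notation ($G$, $\mathcal{C}(G)$, $\psi_C$, $Z$) is assumed here. If a strictly positive density $f$ of $X = (X_1, \dots, X_d)$ satisfies the Markov properties with respect to an undirected graph $G$, then $f$ factorizes over the set $\mathcal{C}(G)$ of cliques of $G$:

```latex
f(x) \;=\; \frac{1}{Z} \prod_{C \in \mathcal{C}(G)} \psi_C(x_C),
\qquad \psi_C > 0,
\qquad Z \;=\; \int \prod_{C \in \mathcal{C}(G)} \psi_C(x_C)\,\mathrm{d}x .
```

Each factor $\psi_C$ depends only on the coordinates $x_C$ in one clique, which is the sense in which a joint density is "expressed in terms of lower dimensional functions".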
37	Polyhedral Problems in Combinatorial Convex Geometry Solus, Liam 01 January 2015 In this dissertation, we exhibit two instances of polyhedra in combinatorial convex geometry. The first instance arises in the context of Ehrhart theory, and the polyhedra are the central objects of study. The second instance arises in algebraic statistics, and the polyhedra act as a conduit through which we study a nonpolyhedral problem. In the first case, we examine combinatorial and algebraic properties of the Ehrhart h*-polynomial of the r-stable (n,k)-hypersimplices. These are a family of polytopes which form a nested chain of subpolytopes within the (n,k)-hypersimplex. We show that a well-studied unimodular triangulation of the (n,k)-hypersimplex restricts to a triangulation of each r-stable (n,k)-hypersimplex within. We then use this triangulation to compute the facet-defining inequalities of these polytopes. In the k=2 case, we use shelling techniques to devise a combinatorial interpretation of the coefficients of the h*-polynomials in terms of independent sets of certain graphs. From this, we then extract some results on unimodality. We also characterize the Gorenstein r-stable (n,k)-hypersimplices, and we conclude that these also have unimodal h*-polynomials. In the second case, for a graph G on p vertices we consider the closure of the cone of concentration matrices of G. The extreme rays of this cone, and their associated ranks, have applications in maximum likelihood estimation for the undirected Gaussian graphical model associated to G. Consequently, the extreme ranks of this cone have been well-studied. Yet, there are few graph classes for which all the possible extreme ranks are known. We show that the facet-normals of the cut polytope of G can serve to identify extreme rays of this nonpolyhedral cone. 
We see that for graphs without K5 minors each facet-normal of the cut polytope identifies an extreme ray in the cone, and we determine the rank of this extreme ray. When the graph is also series-parallel, we find that all possible extreme ranks arise in this fashion, thereby extending the collection of graph classes for which all the possible extreme ranks are known. r-stable hypersimplices hypersimplices cut polytope graphical models facet-ray identification Discrete Mathematics and Combinatorics
38	Machine learning: Bayesian networks and applications [Μηχανική μάθηση : Bayesian δίκτυα και εφαρμογές] Χριστακοπούλου, Κωνσταντίνα 13 October 2013 In this diploma thesis we address the use of Bayesian networks, and more generally of probabilistic graphical models, in machine learning. In the first chapters we concisely present the theoretical foundation of these structured probabilistic models, which consists of the basic phases of representation, inference, decision making, and learning from the available data. In the following chapters, we examine a wide range of applications of probabilistic graphical models and present the results of the simulations we implemented. Specifically, Bayesian networks, Markov networks, and factor graphs are first defined using graphs. Next, the inference algorithms that allow probability distributions to be computed directly from the graphs are presented. Decision making under uncertainty is facilitated with decision trees and influence diagrams. Subsequently, learning the structure and parameters of probabilistic graphical models is studied in the presence of complete or partial data. Finally, scenarios that demonstrate the expressive power, flexibility, and usability of probabilistic graphical models in real-world applications are presented at length. / The main subject of this diploma thesis is how probabilistic graphical models can be used in a wide range of real-world scenarios. In the first chapters, we present in a concise way the theoretical foundations of graphical models, which consist of the deeply related phases of representation, inference, decision theory and learning from data. In the next chapters, we work on many applications, from Optical Character Recognition to Recognizing Actions, and we present the results from the simulations. 
006.31 Machine learning Bayesian networks Probabilistic graphical models
39	A Python implementation of graphical models Gouws, Almero 03 1900 Thesis (MScEng (Electrical and Electronic Engineering))--University of Stellenbosch, 2010. / ENGLISH ABSTRACT: In this thesis we present GrMPy, a library of classes and functions implemented in Python, designed for implementing graphical models. GrMPy supports both undirected and directed models, exact and approximate probabilistic inference, and parameter estimation from complete and incomplete data. In this thesis we outline the theory required to understand the tools implemented within GrMPy and provide pseudo-code algorithms that illustrate how GrMPy is implemented. Graphical models Bayesian networks Markov random fields Dissertations -- Electronic engineering Theses -- Electronic engineering GrMPy
40	Nonparametric Discovery of Human Behavior Patterns from Multimodal Data Sun, Feng-Tso 01 May 2014 Recent advances in sensor technologies and the growing interest in context-aware applications, such as targeted advertising and location-based services, have led to a demand for understanding human behavior patterns from sensor data. People engage in routine behaviors. Automatic routine discovery goes beyond low-level activity recognition such as sitting or standing and analyzes human behaviors at a higher level (e.g., commuting to work). The goal of the research presented in this thesis is to automatically discover high-level semantic human routines from low-level sensor streams. One recent line of research is to mine human routines from sensor data using parametric topic models. The main shortcoming of parametric models is that they assume a fixed, pre-specified parameter regardless of the data. Choosing an appropriate parameter usually requires an inefficient trial-and-error model selection process. Furthermore, it is even more difficult to find optimal parameter values in advance for personalized applications. The research presented in this thesis offers a novel nonparametric framework for human routine discovery that can infer high-level routines without knowing the number of latent low-level activities beforehand. More specifically, the framework automatically finds the size of the low-level feature vocabulary from sensor feature vectors at the vocabulary extraction phase. At the routine discovery phase, the framework further automatically selects the appropriate number of latent low-level activities and discovers latent routines. Moreover, we propose a new generative graphical model to incorporate multimodal sensor streams for the human activity discovery task. The hypothesis and approaches presented in this thesis are evaluated on public datasets in two routine domains: two daily-activity datasets and a transportation mode dataset. 
Experimental results show that our nonparametric framework can automatically learn the appropriate model parameters from multimodal sensor data without any form of manual model selection procedure and can outperform traditional parametric approaches for human routine discovery tasks. Activity recognition machine learning topic modeling nonparametric Bayesian probabilistic graphical models context-aware systems
