1 |
Learning Deep Generative Models
Salakhutdinov, Ruslan 02 March 2010 (has links)
Building intelligent systems that are capable of extracting high-level representations from high-dimensional sensory data lies at the core of solving many AI-related tasks, including object recognition, speech perception, and language understanding. Theoretical and biological arguments strongly suggest that building such systems requires models with deep architectures that involve many layers of nonlinear processing. The aim of the thesis is to demonstrate that deep generative models that contain many layers of latent variables and millions of parameters can be learned efficiently, and that the learned high-level feature representations can be successfully applied in a wide spectrum of application domains, including visual object recognition, information retrieval, and classification and regression tasks. In addition, similar methods can be used for nonlinear dimensionality reduction.
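To make the training idea concrete, here is a minimal sketch of one building block of such models, a restricted Boltzmann machine trained with one step of contrastive divergence (CD-1); the layer sizes, learning rate, and toy data are illustrative assumptions, not values from the thesis. Deep generative models of this kind are typically built by stacking several such layers, each trained on the hidden activations of the layer below.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative sizes: 784 binary visible units (28x28 pixels), 256 hidden units.
n_vis, n_hid = 784, 256
W = 0.01 * rng.standard_normal((n_vis, n_hid))
b_v = np.zeros(n_vis)   # visible biases
b_h = np.zeros(n_hid)   # hidden biases

def cd1_update(v0, lr=0.05):
    """One contrastive-divergence (CD-1) update on a batch v0 of shape (batch, n_vis)."""
    global W, b_v, b_h
    p_h0 = sigmoid(v0 @ W + b_h)                      # positive phase
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    p_v1 = sigmoid(h0 @ W.T + b_v)                    # one Gibbs step back to visibles
    p_h1 = sigmoid(p_v1 @ W + b_h)                    # negative phase
    n = v0.shape[0]
    # Update from the difference between data and model statistics.
    W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / n
    b_v += lr * (v0 - p_v1).mean(axis=0)
    b_h += lr * (p_h0 - p_h1).mean(axis=0)

# Toy run on random binary "images"; real training would iterate over a dataset.
data = (rng.random((64, n_vis)) < 0.1).astype(float)
for _ in range(10):
    cd1_update(data)
```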
|
2 |
Simultaneous Measurement Imputation and Rehabilitation Outcome Prediction for Achilles Tendon Rupture
Hamesse, Charles January 2018 (has links)
Achilles Tendon Rupture (ATR) is one of the typical soft tissue injuries. Rehabilitation after such musculoskeletal injuries remains a prolonged process with highly variable outcomes. Being able to predict the rehabilitation outcome accurately is crucial for treatment decision support. In this work, we design a probabilistic model to predict the rehabilitation outcome for ATR using a clinical cohort with numerous missing entries. Our model is trained end-to-end to simultaneously predict the missing entries and the rehabilitation outcome. We evaluate our model and compare it with multiple baselines, including multi-stage methods. Experimental results demonstrate the superiority of our model over these baseline multi-stage approaches with various data imputation methods for ATR rehabilitation outcome prediction.
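As a loose illustration of simultaneous imputation and outcome prediction, the sketch below conditions a single joint Gaussian over measurements and outcome, so that one conditioning step fills in the missing measurements and predicts the outcome together. The cohort, dimensions, and Gaussian assumption are invented for the example; the thesis model is a richer probabilistic model trained end-to-end, and in practice the parameters would be fit from the incomplete data rather than known in advance.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative synthetic cohort: 3 clinical measurements plus 1 outcome score,
# jointly Gaussian (a sketch assumption, not the thesis model).
n, d = 500, 4                        # last column = rehabilitation outcome
A = rng.standard_normal((d, d))
cov = A @ A.T + d * np.eye(d)
mu = rng.standard_normal(d)
X = rng.multivariate_normal(mu, cov, size=n)

def condition(mu, cov, obs_idx, obs_val):
    """Conditional Gaussian of the unobserved coordinates given the observed ones."""
    miss_idx = [i for i in range(len(mu)) if i not in obs_idx]
    S_oo = cov[np.ix_(obs_idx, obs_idx)]
    S_mo = cov[np.ix_(miss_idx, obs_idx)]
    K = S_mo @ np.linalg.inv(S_oo)
    mu_cond = mu[miss_idx] + K @ (obs_val - mu[obs_idx])
    cov_cond = cov[np.ix_(miss_idx, miss_idx)] - K @ S_mo.T
    return miss_idx, mu_cond, cov_cond

# A patient with measurement 1 missing and outcome unknown: one conditioning
# step yields the imputed measurement and the outcome prediction together.
# (For brevity we condition on the true parameters; real use would fit them, e.g. by EM.)
obs_idx, obs_val = [0, 2], X[0, [0, 2]]
miss_idx, m, C = condition(mu, cov, obs_idx, obs_val)
print(dict(zip(miss_idx, m)))        # {1: imputed measurement, 3: predicted outcome}
```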
|
3 |
Machine Learning: Bayesian Networks and Applications
Christakopoulou, Konstantina 13 October 2013 (has links)
This diploma thesis addresses the use of Bayesian networks, and more generally of probabilistic graphical models, in machine learning. The first chapters concisely present the theoretical foundations of these structured probabilistic models, organized around the closely related phases of representation, inference, decision making, and learning from the available data. The later chapters examine a wide range of applications of probabilistic graphical models and present the results of the simulations we implemented.
Specifically, Bayesian networks, Markov networks, and factor graphs are first defined in terms of graphs. Next, the inference algorithms that allow probability distributions to be computed directly from the graphs are presented. Decision making under uncertainty is supported with decision trees and influence diagrams. The thesis then studies learning the structure and parameters of probabilistic graphical models from complete or partial data sets. Finally, detailed scenarios, ranging from optical character recognition to action recognition, demonstrate the expressive power, flexibility, and practical utility of probabilistic graphical models in real-world applications.
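As a concrete illustration of the representation-and-inference pipeline described above, here is a minimal Bayesian network with hand-picked conditional probability tables and exact inference by enumeration; the network and its numbers are the classic sprinkler toy example, not material from the thesis.

```python
# Toy Bayesian network (Rain -> Sprinkler, Rain & Sprinkler -> WetGrass)
# with hand-picked CPTs; purely illustrative.
P_rain = {True: 0.2, False: 0.8}
P_sprinkler = {True: {True: 0.01, False: 0.99},    # P(S | R=True)
               False: {True: 0.4, False: 0.6}}     # P(S | R=False)
P_wet = {(True, True): 0.99, (True, False): 0.9,
         (False, True): 0.9, (False, False): 0.0}  # P(W=True | S, R)

def joint(r, s, w):
    """Joint probability factorized along the graph: P(R) P(S|R) P(W|S,R)."""
    pw = P_wet[(s, r)]
    return P_rain[r] * P_sprinkler[r][s] * (pw if w else 1.0 - pw)

# Inference by enumeration: P(Rain | WetGrass = True).
num = sum(joint(True, s, True) for s in (True, False))
den = sum(joint(r, s, True) for r in (True, False) for s in (True, False))
print(f"P(Rain | WetGrass) = {num / den:.3f}")
```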
|
4 |
Nonparametric Discovery of Human Behavior Patterns from Multimodal Data
Sun, Feng-Tso 01 May 2014 (has links)
Recent advances in sensor technologies and the growing interest in context-aware applications, such as targeted advertising and location-based services, have led to a demand for understanding human behavior patterns from sensor data. People engage in routine behaviors. Automatic routine discovery goes beyond low-level activity recognition such as sitting or standing and analyzes human behaviors at a higher level (e.g., commuting to work). The goal of the research presented in this thesis is to automatically discover high-level semantic human routines from low-level sensor streams. One recent line of research mines human routines from sensor data using parametric topic models. The main shortcoming of parametric models is that they assume a fixed, pre-specified parameter regardless of the data. Choosing an appropriate parameter usually requires an inefficient trial-and-error model selection process, and it is even more difficult to find optimal parameter values in advance for personalized applications. The research presented in this thesis offers a novel nonparametric framework for human routine discovery that can infer high-level routines without knowing the number of latent low-level activities beforehand. More specifically, the framework automatically finds the size of the low-level feature vocabulary from sensor feature vectors in the vocabulary extraction phase. In the routine discovery phase, the framework then automatically selects the appropriate number of latent low-level activities and discovers latent routines. Moreover, we propose a new generative graphical model to incorporate multimodal sensor streams for the human activity discovery task. The hypotheses and approaches presented in this thesis are evaluated on public datasets from two routine domains: two daily-activity datasets and a transportation-mode dataset. Experimental results show that our nonparametric framework can automatically learn appropriate model parameters from multimodal sensor data without any manual model selection procedure and can outperform traditional parametric approaches for human routine discovery tasks.
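As a loose illustration of how a nonparametric method can choose the number of components from the data, the sketch below uses DP-means, a small-variance limit of a Dirichlet process mixture that opens a new cluster whenever a point lies far from every existing centroid; the data, penalty, and feature dimensionality are invented, and the thesis itself uses richer nonparametric Bayesian models.

```python
import numpy as np

rng = np.random.default_rng(2)

def dp_means(X, lam, n_iter=20):
    """DP-means: like k-means, but a point farther than sqrt(lam) from every
    centroid opens a new cluster, so the number of clusters (here, the size of
    the low-level feature vocabulary) is determined by the data."""
    centers = [X[0].copy()]
    for _ in range(n_iter):
        assign = []
        for x in X:
            d2 = [np.sum((x - c) ** 2) for c in centers]
            if min(d2) > lam:
                centers.append(x.copy())          # open a new cluster
                assign.append(len(centers) - 1)
            else:
                assign.append(int(np.argmin(d2)))
        assign = np.array(assign)
        # Recompute centroids, keeping the old one if a cluster went empty.
        centers = [X[assign == k].mean(axis=0) if np.any(assign == k) else centers[k]
                   for k in range(len(centers))]
    return np.array(centers), assign

# Toy 2-D sensor feature vectors drawn from three unknown modes.
X = np.vstack([rng.normal(m, 0.3, size=(100, 2)) for m in ((0, 0), (3, 3), (0, 4))])
centers, assign = dp_means(X, lam=2.0)
print("discovered vocabulary size:", len(centers))
```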
|
5 |
Facial feature localization using highly flexible yet sufficiently strict shape models
Tamersoy, Birgi 18 September 2014 (has links)
Accurate and efficient localization of facial features is a crucial first step in many face-related computer vision tasks, including identity recognition, expression recognition, and head-pose estimation. Most effort in the field has gone toward better ways of modeling prior appearance knowledge and image observations; modeling prior shape knowledge has not been explored as much. In this dissertation I focus primarily on the limitations of existing methods in modeling prior shape knowledge.
I first introduce a new pose-constrained shape model, which I describe as "highly flexible yet sufficiently strict". Existing pose-constrained shape models are either too strict, with questionable generalization power, or too loose, with questionable localization accuracy. My model seeks a middle ground by learning which shape constraints are more "informative" and should be kept, and which are less important and may be omitted. I build my pose-constrained facial feature localization approach on this shape model within a probabilistic graphical model framework, where the observed variables are the local image observations and the unobserved variables are the feature locations. Feature localization, or "probabilistic inference", is then achieved by nonparametric belief propagation. I show through qualitative and quantitative experiments that this approach outperforms other popular pose-constrained methods.
Next, I extend my pose-constrained localization approach to the unconstrained setting using a multi-model strategy. In doing so, I identify and address two key limitations of existing multi-model methods: 1) defining the models semantically and manually, or "guiding" their generation; and 2) lacking efficient and effective model selection strategies. First, I introduce an approach based on unsupervised clustering, in which the models are learned automatically from training data. I then complement this approach with an efficient and effective model selection strategy based on a multi-class naive Bayes classifier. As a result, my method can use many more models, each with greater expressive power, and consequently partitions the face image space more effectively. This approach is validated through extensive experiments and comparisons with state-of-the-art methods on state-of-the-art datasets.
In the last part of this dissertation I discuss a particular application of these techniques: facial feature localization in unconstrained videos. I improve frame-by-frame localization by estimating the actual head movement from a sequence of noisy head-pose estimates and using this information to detect and correct localization failures.
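As a sketch of the model selection step alone, the snippet below uses a multi-class Gaussian naive Bayes classifier to pick which of K learned models to apply to a query image, scoring each model by its posterior under independent per-feature Gaussians. The feature dimensionality, parameters, and Gaussian assumption are invented; the dissertation's classifier operates on its own image features and learned clusters.

```python
import numpy as np

rng = np.random.default_rng(3)

K, d = 4, 16                                    # K candidate shape models, d image features
means = rng.standard_normal((K, d))             # per-model feature means (toy values)
stds = 0.5 + rng.random((K, d))                 # per-model feature std devs (toy values)
priors = np.full(K, 1.0 / K)                    # uniform model priors

def log_gaussian(x, mu, sigma):
    """Elementwise log density of independent Gaussians."""
    return -0.5 * np.log(2 * np.pi * sigma ** 2) - 0.5 * ((x - mu) / sigma) ** 2

def select_model(features):
    """Pick the shape model with the highest naive Bayes posterior."""
    log_post = [np.log(priors[k]) + log_gaussian(features, means[k], stds[k]).sum()
                for k in range(K)]
    return int(np.argmax(log_post))

query = rng.standard_normal(d)                  # features of a toy query image
print("apply shape model", select_model(query))
```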
|
6 |
Statistical Text Analysis for Social Science
O'Connor, Brendan T. 01 August 2014 (has links)
What can text corpora tell us about society? How can automatic text analysis algorithms efficiently and reliably analyze the social processes revealed in language production? This work develops statistical text analyses of dynamic social and news media datasets to extract indicators of underlying social phenomena, and to reveal how social factors guide linguistic production. This is illustrated through three case studies: first, examining whether sentiment expressed in social media can track opinion polls on economic and political topics (Chapter 3); second, analyzing how novel online slang terms can be very specific to geographic and demographic communities, and how these social factors affect their transmission over time (Chapters 4 and 5); and third, automatically extracting political events from news articles, to assist analyses of the interactions of international actors over time (Chapter 6). We demonstrate a variety of computational, linguistic, and statistical tools that are employed for these analyses, and also contribute MiTextExplorer, an interactive system for exploratory analysis of text data against document covariates, whose design was informed by the experience of researching these and other similar works (Chapter 2). These case studies illustrate recurring themes toward developing text analysis as a social science methodology: computational and statistical complexity, and domain knowledge and linguistic assumptions.
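For the first case study, a minimal sketch of the underlying computation, a smoothed daily positive-to-negative sentiment ratio correlated against a poll series, might look like the following; the counts, poll values, and smoothing window are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy daily counts of positive/negative sentiment words in a text stream,
# plus a toy poll series; all numbers are invented.
days = 120
pos = rng.poisson(50 + 10 * np.sin(np.arange(days) / 10.0))
neg = rng.poisson(50, size=days)
polls = 0.5 + 0.1 * np.sin((np.arange(days) - 5) / 10.0) + rng.normal(0, 0.01, days)

ratio = pos / np.maximum(neg, 1)               # daily sentiment score
k = 14                                         # smoothing window in days
smooth = np.convolve(ratio, np.ones(k) / k, mode="valid")

# Align the smoothed sentiment with the poll series and correlate.
aligned = polls[k - 1:]
r = np.corrcoef(smooth, aligned)[0, 1]
print(f"correlation over {len(smooth)} days: r = {r:.2f}")
```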
|
7 |
Word meaning in context as a paraphrase distribution: evidence, learning, and inference
Moon, Taesun, Ph.D. 25 October 2011 (has links)
In this dissertation, we introduce a graph-based model of instance-based usage meaning that is cast as a problem of probabilistic inference. The main aim of this model is to provide a flexible platform that can be used to explore multiple hypotheses about usage meaning computation. Our model takes up and extends the proposals of Erk and Pado [2007] and McCarthy and Navigli [2009] by representing usage meaning as a probability distribution over potential paraphrases. We use undirected graphical models to infer this probability distribution for every content word in a given sentence. Graphical models represent complex probability distributions through a graph: nodes stand for random variables, edges stand for direct probabilistic interactions between them, and the lack of an edge between two variables reflects an independence assumption. In our model, we represent each content word of the sentence through two adjacent nodes: the observed node represents the surface form of the word itself, and the hidden node represents its usage meaning. The distribution over values that we infer for the hidden node is a paraphrase distribution for the observed word. To encode the fact that lexical semantic information is exchanged between syntactic neighbors, the graph contains edges that mirror the dependency graph of the sentence. Further knowledge sources that influence the hidden nodes are represented through additional edges that, for example, connect to the document topic. Adjacent knowledge sources are integrated in the standard way, by multiplying factors and marginalizing over variables.
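A minimal sketch of this inference, on a three-word dependency chain with a shared paraphrase vocabulary, is shown below; the vocabulary and potentials are invented for illustration rather than the dissertation's learned factors. Sum-product message passing along the chain yields, for each hidden node, a normalized paraphrase distribution.

```python
import numpy as np

# Paraphrase vocabulary shared by all hidden nodes (toy).
vocab = ["bright", "smart", "shiny", "clever"]
# Unary potential phi_i(p): compatibility of paraphrase p with observed word i.
phi = np.array([[4.0, 3.0, 1.0, 2.0],    # observed word 1
                [1.0, 4.0, 0.5, 3.0],    # observed word 2
                [2.0, 1.0, 4.0, 0.5]])   # observed word 3
# Pairwise potential psi(p, q) on dependency edges (shared here for brevity):
# neighboring words prefer semantically matching paraphrases.
psi = np.ones((4, 4)) + 3 * np.eye(4)

# Exact sum-product on the chain 1 - 2 - 3: messages into the middle node.
m12 = phi[0] @ psi                       # message from word 1 to word 2
m32 = phi[2] @ psi                       # message from word 3 to word 2
belief2 = phi[1] * m12 * m32             # combine unary potential with messages
belief2 /= belief2.sum()
print(dict(zip(vocab, belief2.round(3))))  # paraphrase distribution for word 2
```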
Evaluating on a paraphrasing task, we find that our model outperforms the current state-of-the-art usage vector model [Thater et al., 2010] on all parts of speech except verbs, where the previous model wins by a small margin. But our main focus is not on the numbers but on the fact that our model is flexible enough to encode different hypotheses about usage meaning computation. In particular, we concentrate on five questions (with minor variants):
- Nonlocal syntactic context: Existing usage vector models use only a word's direct syntactic neighbors for disambiguation or for inferring some other meaning representation. Would it help to instead let contextual information "flow" along the entire dependency graph, with each word's inferred meaning relying on the paraphrase distributions of its neighbors?
- Influence of collocational information: In some cases, it is intuitively plausible to use the selectional preference of a neighboring word towards the target to determine its meaning in context. How does building selectional preferences into the model affect performance?
- Non-syntactic bag-of-words context: To what extent can non-syntactic information in the form of bag-of-words context help in inferring meaning?
- Effects of parametrization: We experiment with two transformations of the maximum likelihood estimates (MLE): one interpolates various MLEs, and the other transforms the MLE by exponentiating pointwise mutual information. Which performs better?
- Type of hidden nodes: Our model posits a tier of hidden nodes immediately adjacent to the surface tier of observed words to capture dynamic usage meaning. We examine two variants of the hidden nodes: in one, the nodes take actual words as values; in the other, they take nameless indexes as values. The former has the benefit of interpretability, while the latter allows more standard parameter estimation.
Portions of this dissertation are derived from joint work between the author and Katrin Erk [submitted].
|
8 |
Sum-Product Network in the context of missing data
Clavier, Pierre January 2020 (has links)
In recent years, interest in new deep learning methods has increased considerably due to their robustness and their applications in many fields. However, the lack of interpretability of these models and the lack of theoretical knowledge about them raise many issues. It is in this context that sum-product network (SPN) models have emerged. From a mathematical point of view, SPNs can be described as directed acyclic graphs. In practice, they can be seen as deep mixture models and can consequently represent very rich collections of distributions. The objective of this master thesis was threefold. First, we formalized the concept of SPNs with proper mathematical notation, using directed acyclic graphs and Bayesian network theory. Then, we developed a new method for learning the structure of an SPN, based on K-means clustering and mutual information. Finally, we proposed a new method for estimating the parameters of a fixed SPN in the context of incomplete data. Our estimation method is based on maximum likelihood via the EM algorithm. / In recent years, interest in new deep learning methods has grown considerably because of their robustness and their applications in a wide range of fields. However, the lack of theoretical knowledge about these models and their hard-to-interpret nature raise many questions. It is in this context that the sum-product network emerged, a somewhat ambivalent model in that it sits between a linear neural network without activation functions and a probabilistic graph. In typical applications with real data, we often encounter incomplete, censored, or truncated records, yet the learning of these graphs from real data remains virtually nonexistent. The purpose of this thesis is to study some fundamental properties of sum-product networks and to extend their learning and training to incomplete data. Maximum likelihood estimation using EM algorithms will be used to extend the learning of these graphs to incomplete data.
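A minimal sketch of the property that makes SPNs attractive for missing data, exact marginalization of unobserved variables in a single upward pass, is shown below; the structure, mixture weights, and leaf parameters are invented for illustration, not taken from the thesis.

```python
import numpy as np

# Minimal SPN: a sum node mixing two product nodes, each a product of
# independent Bernoulli leaves over binary variables X1, X2 (toy values).
weights = np.array([0.3, 0.7])             # sum-node mixture weights
leaf_p = np.array([[0.9, 0.2],             # component 0: P(X1=1), P(X2=1)
                   [0.1, 0.8]])            # component 1

def leaf(p, x):
    """Bernoulli leaf; a missing value (None) marginalizes out to 1."""
    if x is None:
        return 1.0
    return p if x == 1 else 1.0 - p

def spn(x1, x2):
    """Single upward pass: product nodes, then the weighted sum node."""
    prods = [leaf(leaf_p[k, 0], x1) * leaf(leaf_p[k, 1], x2) for k in range(2)]
    return float(weights @ prods)

print(spn(1, 0))      # full evidence: P(X1=1, X2=0)
print(spn(1, None))   # X2 missing: the same pass returns the marginal P(X1=1)
```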
|
9 |
Measuring Interestingness in Outliers with Explanation Facility using Belief Networks
Masood, Adnan 01 January 2014 (has links)
This research explores the potential of improving the explainability of outliers using Bayesian belief networks as background knowledge. Outliers are deviations from the usual trends in data, and mining them may help discover potential anomalies and fraudulent activities. Meaningful outliers can be retrieved and analyzed using domain knowledge. Domain (or background) knowledge is represented using probabilistic graphical models such as Bayesian belief networks, graph-based representations used to model and encode mutual relationships between entities. Owing to their probabilistic graphical nature, belief networks are an ideal way to capture sensitivity, causal inference, uncertainty, and background knowledge in real-world data sets. Bayesian networks represent the causal relationships between entities (nodes) using conditional probabilities, and this probabilistic relationship expresses the degree of belief between entities. A quantitative measure that computes changes in this degree of belief acts as a sensitivity measure.
The first contribution of this research is improved performance in measuring sensitivity, building on earlier work on the Interestingness Filtering Engine Miner algorithm. The algorithm developed here, IBOX (Interestingness-based Bayesian Outlier eXplainer), progressively improves on the performance and sensitivity scoring of earlier approaches. Those approaches compute sensitivity by measuring divergence between the conditional probabilities of training and test data, using only a couple of probabilistic interestingness measures, such as mutual information and support, to calculate belief sensitivity. With support from the literature as well as quantitative evidence, IBOX provides a framework for using multiple interestingness measures, resulting in better performance, improved sensitivity analysis, and therefore better explainability of rare-class entities. This research quantitatively validates probabilistic interestingness measures as an effective sensitivity analysis technique in rare-class mining, a novel contribution to the areas of probabilistic graphical models and outlier analysis.
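As a sketch of one such interestingness measure used as a sensitivity score, the snippet below estimates the mutual information between two discrete variables from a contingency table; the counts are invented, and IBOX itself combines several measures rather than this one alone.

```python
import numpy as np

# Toy contingency table of joint counts for two binary variables A and B.
counts = np.array([[30.0, 5.0],    # rows: A = 0/1, cols: B = 0/1
                   [10.0, 55.0]])
p = counts / counts.sum()          # joint distribution estimate
pa = p.sum(axis=1, keepdims=True)  # marginal of A
pb = p.sum(axis=0, keepdims=True)  # marginal of B

# I(A;B) = sum_ab p(a,b) log2( p(a,b) / (p(a) p(b)) ), skipping zero cells.
with np.errstate(divide="ignore", invalid="ignore"):
    terms = np.where(p > 0, p * np.log2(p / (pa * pb)), 0.0)
mi = terms.sum()
print(f"I(A;B) = {mi:.3f} bits")   # larger values = stronger belief coupling
```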
|