Global ETD Search

11	Monte Carlo integration in discrete undirected probabilistic models Hamze, Firas 05 1900 (has links) This thesis contains the author’s work in and contributions to the field of Monte Carlo sampling for undirected graphical models, a class of statistical model commonly used in machine learning, computer vision, and spatial statistics; the aim is to be able to use the methodology and resultant samples to estimate integrals of functions of the variables in the model. Over the course of the study, three different but related methods were proposed and have appeared as research papers. The thesis consists of an introductory chapter discussing the models considered, the problems involved, and a general outline of Monte Carlo methods. The three subsequent chapters contain versions of the published work. The second chapter, which has appeared in (Hamze and de Freitas 2004), is a presentation of new MCMC algorithms for computing the posterior distributions and expectations of the unknown variables in undirected graphical models with regular structure. For demonstration purposes, we focus on Markov Random Fields (MRFs). By partitioning the MRFs into non-overlapping trees, it is possible to compute the posterior distribution of a particular tree exactly by conditioning on the remaining tree. These exact solutions allow us to construct efficient blocked and Rao-Blackwellised MCMC algorithms. We show empirically that tree sampling is considerably more efficient than other partitioned sampling schemes and the naive Gibbs sampler, even in cases where loopy belief propagation fails to converge. We prove that tree sampling exhibits lower variance than the naive Gibbs sampler and other naive partitioning schemes using the theoretical measure of maximal correlation. We also construct new information theory tools for comparing different MCMC schemes and show that, under these, tree sampling is more efficient. Although the work discussed in Chapter 2 exhibited promise on the class of graphs to which it was suited, there are many cases where limiting the topology is quite a handicap. The work in Chapter 3 was an exploration in an alternative methodology for approximating functions of variables representable as undirected graphical models of arbitrary connectivity with pairwise potentials, as well as for estimating the notoriously difficult partition function of the graph. The algorithm, published in (Hamze and de Freitas 2005), fits into the framework of sequential Monte Carlo methods rather than the more widely used MCMC, and relies on constructing a sequence of intermediate distributions which get closer to the desired one. While the idea of using “tempered” proposals is known, we construct a novel sequence of target distributions where, rather than dropping a global temperature parameter, we sequentially couple individual pairs of variables that are, initially, sampled exactly from a spanning treeof the variables. We present experimental results on inference and estimation of the partition function for sparse and densely-connected graphs. The final contribution of this thesis, presented in Chapter 4 and also in (Hamze and de Freitas 2007), emerged from some empirical observations that were made while trying to optimize the sequence of edges to add to a graph so as to guide the population of samples to the high-probability regions of the model. Most important among these observations was that while several heuristic approaches, discussed in Chapter 1, certainly yielded improvements over edge sequences consisting of random choices, strategies based on forcing the particles to take large, biased random walks in the state-space resulted in a more efficient exploration, particularly at low temperatures. This motivated a new Monte Carlo approach to treating complex discrete distributions. The algorithm is motivated by the N-Fold Way, which is an ingenious event-driven MCMC sampler that avoids rejection moves at any specific state. The N-Fold Way can however get “trapped” in cycles. We surmount this problem by modifying the sampling process to result in biased state-space paths of randomly chosen length. This alteration does introduce bias, but the bias is subsequently corrected with a carefully engineered importance sampler. Graphical models Monte Carlo inference Bayesian inference
12	Graphical Models: Modeling, Optimization, and Hilbert Space Embedding Zhang, Xinhua, xinhua.zhang.cs@gmail.com January 2010 (has links) Over the past two decades graphical models have been widely used as powerful tools for compactly representing distributions. On the other hand, kernel methods have been used extensively to come up with rich representations. This thesis aims to combine graphical models with kernels to produce compact models with rich representational abilities. Graphical models are a powerful underlying formalism in machine learning. Their graph theoretic properties provide both an intuitive modular interface to model the interacting factors, and a data structure facilitating efficient learning and inference. The probabilistic nature ensures the global consistency of the whole framework, and allows convenient interface of models to data. Kernel methods, on the other hand, provide an effective means of representing rich classes of features for general objects, and at the same time allow efficient search for the optimal model. Recently, kernels have been used to characterize distributions by embedding them into high dimensional feature space. Interestingly, graphical models again decompose this characterization and lead to novel and direct ways of comparing distributions based on samples. Among the many uses of graphical models and kernels, this thesis is devoted to the following four areas: Conditional random fields for multi-agent reinforcement learning Conditional random fields (CRFs) are graphical models for modelling the probability of labels given the observations. They have traditionally been trained with using a set of observation and label pairs. Underlying all CRFs is the assumption that, conditioned on the training data, the label sequences of different training examples are independent and identically distributed (iid ). We extended the use of CRFs to a class of temporal learning algorithms, namely policy gradient reinforcement learning (RL). Now the labels are no longer iid. They are actions that update the environment and affect the next observation. From an RL point of view, CRFs provide a natural way to model joint actions in a decentralized Markov decision process. They define how agents can communicate with each other to choose the optimal joint action. We tested our framework on a synthetic network alignment problem, a distributed sensor network, and a road traffic control system. Using tree sampling by Hamze & de Freitas (2004) for inference, the RL methods employing CRFs clearly outperform those which do not model the proper joint policy. Bayesian online multi-label classification Gaussian density filtering (GDF) provides fast and effective inference for graphical models (Maybeck, 1982). Based on this natural online learner, we propose a Bayesian online multi-label classification (BOMC) framework which learns a probabilistic model of the linear classifier. The training labels are incorporated to update the posterior of the classifiers via a graphical model similar to TrueSkill (Herbrich et al., 2007), and inference is based on GDF with expectation propagation. Using samples from the posterior, we label the test data by maximizing the expected F-score. Our experiments on Reuters1-v2 dataset show that BOMC delivers significantly higher macro-averaged F-score than the state-of-the-art online maximum margin learners such as LaSVM (Bordes et al., 2005) and passive aggressive online learning (Crammer et al., 2006). The online nature of BOMC also allows us to effciently use a large amount of training data. Hilbert space embedment of distributions Graphical models are also an essential tool in kernel measures of independence for non-iid data. Traditional information theory often requires density estimation, which makes it unideal for statistical estimation. Motivated by the fact that distributions often appear in machine learning via expectations, we can characterize the distance between distributions in terms of distances between means, especially means in reproducing kernel Hilbert spaces which are called kernel embedment. Under this framework, the undirected graphical models further allow us to factorize the kernel embedment onto cliques, which yields efficient measures of independence for non-iid data (Zhang et al., 2009). We show the effectiveness of this framework for ICA and sequence segmentation, and a number of further applications and research questions are identified. Optimization in maximum margin models for structured data Maximum margin estimation for structured data, e.g. (Taskar et al., 2004), is an important task in machine learning where graphical models also play a key role. They are special cases of regularized risk minimization, for which bundle methods (BMRM, Teo et al., 2007) and the closely related SVMStruct (Tsochantaridis et al., 2005) are state-of-the-art general purpose solvers. Smola et al. (2007b) proved that BMRM requires O(1/έ) iterations to converge to an έ accurate solution, and we further show that this rate hits the lower bound. By utilizing the structure of the objective function, we devised an algorithm for the structured loss which converges to an έ accurate solution in O(1/√έ) iterations. This algorithm originates from Nesterov's optimal first order methods (Nesterov, 2003, 2005b). Machine Learning Graphical Models Kernel Methods Optimization
13	Combining Object and Feature Dynamics in Probabilistic Tracking Taycher, Leonid, Fisher III, John W., Darrell, Trevor 02 March 2005 (has links) Objects can exhibit different dynamics at different scales, a property that isoftenexploited by visual tracking algorithms. A local dynamicmodel is typically used to extract image features that are then used as inputsto a system for tracking the entire object using a global dynamic model.Approximate local dynamicsmay be brittle---point trackers drift due to image noise and adaptivebackground models adapt to foreground objects that becomestationary---but constraints from the global model can make them more robust.We propose a probabilistic framework for incorporating globaldynamics knowledge into the local feature extraction processes.A global tracking algorithm can beformulated as a generative model and used to predict feature values thatinfluence the observation process of thefeature extractor. We combine such models in a multichain graphicalmodel framework.We show the utility of our framework for improving feature tracking and thusshapeand motion estimates in a batch factorization algorithm.We also propose an approximate filtering algorithm appropriate for onlineapplications, and demonstrate its application to problems such as backgroundsubtraction, structure from motion and articulated body tracking. AI graphical models feature extraction tracking
14	Monte Carlo integration in discrete undirected probabilistic models Hamze, Firas 05 1900 (has links) This thesis contains the author’s work in and contributions to the field of Monte Carlo sampling for undirected graphical models, a class of statistical model commonly used in machine learning, computer vision, and spatial statistics; the aim is to be able to use the methodology and resultant samples to estimate integrals of functions of the variables in the model. Over the course of the study, three different but related methods were proposed and have appeared as research papers. The thesis consists of an introductory chapter discussing the models considered, the problems involved, and a general outline of Monte Carlo methods. The three subsequent chapters contain versions of the published work. The second chapter, which has appeared in (Hamze and de Freitas 2004), is a presentation of new MCMC algorithms for computing the posterior distributions and expectations of the unknown variables in undirected graphical models with regular structure. For demonstration purposes, we focus on Markov Random Fields (MRFs). By partitioning the MRFs into non-overlapping trees, it is possible to compute the posterior distribution of a particular tree exactly by conditioning on the remaining tree. These exact solutions allow us to construct efficient blocked and Rao-Blackwellised MCMC algorithms. We show empirically that tree sampling is considerably more efficient than other partitioned sampling schemes and the naive Gibbs sampler, even in cases where loopy belief propagation fails to converge. We prove that tree sampling exhibits lower variance than the naive Gibbs sampler and other naive partitioning schemes using the theoretical measure of maximal correlation. We also construct new information theory tools for comparing different MCMC schemes and show that, under these, tree sampling is more efficient. Although the work discussed in Chapter 2 exhibited promise on the class of graphs to which it was suited, there are many cases where limiting the topology is quite a handicap. The work in Chapter 3 was an exploration in an alternative methodology for approximating functions of variables representable as undirected graphical models of arbitrary connectivity with pairwise potentials, as well as for estimating the notoriously difficult partition function of the graph. The algorithm, published in (Hamze and de Freitas 2005), fits into the framework of sequential Monte Carlo methods rather than the more widely used MCMC, and relies on constructing a sequence of intermediate distributions which get closer to the desired one. While the idea of using “tempered” proposals is known, we construct a novel sequence of target distributions where, rather than dropping a global temperature parameter, we sequentially couple individual pairs of variables that are, initially, sampled exactly from a spanning treeof the variables. We present experimental results on inference and estimation of the partition function for sparse and densely-connected graphs. The final contribution of this thesis, presented in Chapter 4 and also in (Hamze and de Freitas 2007), emerged from some empirical observations that were made while trying to optimize the sequence of edges to add to a graph so as to guide the population of samples to the high-probability regions of the model. Most important among these observations was that while several heuristic approaches, discussed in Chapter 1, certainly yielded improvements over edge sequences consisting of random choices, strategies based on forcing the particles to take large, biased random walks in the state-space resulted in a more efficient exploration, particularly at low temperatures. This motivated a new Monte Carlo approach to treating complex discrete distributions. The algorithm is motivated by the N-Fold Way, which is an ingenious event-driven MCMC sampler that avoids rejection moves at any specific state. The N-Fold Way can however get “trapped” in cycles. We surmount this problem by modifying the sampling process to result in biased state-space paths of randomly chosen length. This alteration does introduce bias, but the bias is subsequently corrected with a carefully engineered importance sampler. / Science, Faculty of / Computer Science, Department of / Graduate Graphical models Monte Carlo inference Bayesian inference
15	A Molecular Modeling Toolkit with Applications to Efficient Free Energy Computation Tezcan, Hasan Gokhan 01 January 2010 (has links) (PDF) In this thesis we develop a molecular modeling toolkit that models the conformation space of proteins and allows easy prototyping of algorithms on the conformation space models, by extending an existing molecular modeling tool. Our toolkit creates a factor graph to represent the conformation space model and links it with an inference framework. This enables execution of statistical inference tasks and implementation of algorithms that work on graphical models. As an application of our toolkit, we study estimating free energy changes after mutations. We show that it is possible to represent molecular dynamics trajectories using graphical models and free energy perturbation calculations can be done efficiently on these models. proteins free energy conformation space graphical models
16	Apprentissage de graphes structuré et parcimonieux dans des données de haute dimension avec applications à l’imagerie cérébrale / Structured Sparse Learning on Graphs in High-Dimensional Data with Applications to NeuroImaging Belilovsky, Eugene 02 March 2018 (has links) Cette thèse présente de nouvelles méthodes d’apprentissage structuré et parcimonieux sur les graphes, ce qui permet de résoudre une large variété de problèmes d’imagerie cérébrale, ainsi que d’autres problèmes en haute dimension avec peu d’échantillon. La première partie de cette thèse propose des relaxation convexe de pénalité discrète et combinatoriale impliquant de la parcimonie et bounded total variation d’un graphe, ainsi que la bounded `2. Ceux-ci sont dévelopé dansle but d’apprendre un modèle linéaire interprétable et on démontre son efficacacité sur des données d’imageries cérébrales ainsi que sur les problèmes de reconstructions parcimonieux.Les sections successives de cette thèse traite de la découverte de structure sur des modèles graphiques “undirected” construit à partir de peu de données. En particulier, on se concentre sur des hypothèses de parcimonie et autres hypothèses de structures dans les modèles graphiques gaussiens. Deux contributions s’en dégagent. On construit une approche pour identifier les différentes entre des modèles graphiques gaussiens (GGMs) qui partagent la même structure sous-jacente. On dérive la distribution de différences de paramètres sous une pénalité jointe quand la différence des paramètres est parcimonieuse. On montre ensuite comment cette approche peut être utilisée pour obtenir des intervalles de confiances sur les différences prises par le GGM sur les arêtes. De là, on introduit un nouvel algorithme d’apprentissage lié au problème de découverte de structure sur les modèles graphiques non dirigées des échantillons observés. On démontre que les réseaux de neurones peuvent être utilisés pour apprendre des estimateurs efficacaces de ce problèmes. On montre empiriquement que ces méthodes sont une alternatives flexible et performantes par rapport aux techniques existantes. / This dissertation presents novel structured sparse learning methods on graphs that address commonly found problems in the analysis of neuroimaging data as well as other high dimensional data with few samples. The first part of the thesis proposes convex relaxations of discrete and combinatorial penalties involving sparsity and bounded total variation on a graph as well as bounded `2 norm. These are developed with the aim of learning an interpretable predictive linear model and we demonstrate their effectiveness on neuroimaging data as well as a sparse image recovery problem.The subsequent parts of the thesis considers structure discovery of undirected graphical models from few observational data. In particular we focus on invoking sparsity and other structured assumptions in Gaussian Graphical Models (GGMs). To this end we make two contributions. We show an approach to identify differences in Gaussian Graphical Models (GGMs) known to have similar structure. We derive the distribution of parameter differences under a joint penalty when parameters are known to be sparse in the difference. We then show how this approach can be used to obtain confidence intervals on edge differences in GGMs. We then introduce a novel learning based approach to the problem structure discovery of undirected graphical models from observational data. We demonstrate how neural networks can be used to learn effective estimators for this problem. This is empirically shown to be flexible and efficient alternatives to existing techniques. Apprentissage statistique Gaussian graphical models Neuroimagerie Apprentissage profond Machine learning Gaussian graphical models Neuroimaging Deep learning
17	Machine Learning in Computational Biology: Models of Alternative Splicing Shai, Ofer 03 March 2010 (has links) Alternative splicing, the process by which a single gene may code for similar but different proteins, is an important process in biology, linked to development, cellular differentiation, genetic diseases, and more. Genome-wide analysis of alternative splicing patterns and regulation has been recently made possible due to new high throughput techniques for monitoring gene expression and genomic sequencing. This thesis introduces two algorithms for alternative splicing analysis based on large microarray and genomic sequence data. The algorithms, based on generative probabilistic models that capture structure and patterns in the data, are used to study global properties of alternative splicing. In the first part of the thesis, a microarray platform for monitoring alternative splicing is introduced. A spatial noise removal algorithm that removes artifacts and improves data fidelity is presented. The GenASAP algorithm (generative model for alternative splicing array platform) models the non-linear process in which targeted molecules bind to a microarray’s probes and is used to predict patterns of alternative splicing. Two versions of GenASAP have been developed. The first uses variational approximation to infer the relative amounts of the targeted molecules, while the second incorporates a more accurate noise and generative model and utilizes Markov chain Monte Carlo (MCMC) sampling. GenASAP, the first method to provide quantitative predictions of alternative splicing patterns on large scale data sets, is shown to generate useful and precise predictions based on independent RT-PCR validation (a slow but more accurate approach to measuring cellular expression patterns). In the second part of the thesis, the results obtained by GenASAP are analysed to reveal jointly regulated genes. The sequences of the genes are examined for potential regulatory factors binding sites using a new motif finding algorithm designed for this purpose. The motif finding algorithm, called GenBITES (generative model for binding sites) uses a fully Bayesian generative model for sequences, and the MCMC approach used for inference in the model includes moves that can efficiently create or delete motifs, and extend or contract the width of existing motifs. GenBITES has been applied to several synthetic and real data sets, and is shown to be highly competitive at a task for which many algorithms already exist. Although developed to analyze alternative splicing data, GenBITES outperforms most reported results on a benchmark data set based on transcription data. Machine Learning Graphical Models Computational Biology Alternative Splicing
18	Message Passing Algorithms for Facility Location Problems Lazic, Nevena 09 June 2011 (has links) Discrete location analysis is one of the most widely studied branches of operations research, whose applications arise in a wide variety of settings. This thesis describes a powerful new approach to facility location problems - that of message passing inference in probabilistic graphical models. Using this framework, we develop new heuristic algorithms, as well as a new approximation algorithm for a particular problem type. In machine learning applications, facility location can be seen a discrete formulation of clustering and mixture modeling problems. We apply the developed algorithms to such problems in computer vision. We tackle the problem of motion segmentation in video sequences by formulating it as a facility location instance and demonstrate the advantages of message passing algorithms over current segmentation methods. probabilistic graphical models facility location MAP inference 0544
19	Spectral Probablistic Modeling and Applications to Natural Language Processing Parikh, Ankur 01 August 2015 (has links) Probabilistic modeling with latent variables is a powerful paradigm that has led to key advances in many applications such natural language processing, text mining, and computational biology. Unfortunately, while introducing latent variables substantially increases representation power, learning and modeling can become considerably more complicated. Most existing solutions largely ignore non-identifiability issues in modeling and formulate learning as a nonconvex optimization problem, where convergence to the optimal solution is not guaranteed due to local minima. In this thesis, we propose to tackle these problems through the lens of linear/multi-linear algebra. Viewing latent variable models from this perspective allows us to approach key problems such as structure learning and parameter learning using tools such as matrix/tensor decompositions, inversion, and additive metrics. These new tools enable us to develop novel solutions to learning in latent variable models with theoretical and practical advantages. For example, our spectral parameter learning methods for latent trees and junction trees are provably consistent, local-optima-free, and 1-2 orders of magnitude faster thanEMfor large sample sizes. In addition, we focus on applications in Natural Language Processing, using our insights to not only devise new algorithms, but also to propose new models. Our method for unsupervised parsing is the first algorithm that has both theoretical guarantees and is also practical, performing favorably to theCCMmethod of Klein and Manning. We also developed power low rank ensembles, a framework for language modeling that generalizes existing n-gram techniques to non-integer n. It consistently outperforms state-of-the-art Kneser Ney baselines and can train on billion-word datasets in a few hours. probabilistic graphical models spectral methods kernels unsupervised parsing language modeling
20	Maximum Likelihood Identification of an Information Matrix Under Constraints in a Corresponding Graphical Model Li, Nan 22 January 2017 (has links) We address the problem of identifying the neighborhood structure of an undirected graph, whose nodes are labeled with the elements of a multivariate normal (MVN) random vector. A semi-definite program is given for estimating the information matrix under arbitrary constraints on its elements. More importantly, a closed-form expression is given for the maximum likelihood (ML) estimator of the information matrix, under the constraint that the information matrix has pre-specified elements in a given pattern (e.g., in a principal submatrix). The results apply to the identification of dependency labels in a graphical model with neighborhood constraints. This neighborhood structure excludes nodes which are conditionally independent of a given node and the graph is determined by the non- zero elements in the information matrix for the random vector. A cross-validation principle is given for determining whether the constrained information matrix returned from this procedure is an acceptable model for the information matrix, and as a consequence for the neighborhood structure of the Markov Random Field (MRF) that is identified with the MVN random vector. Closed Form Solutions Convex Optimization Information Matrix Graphical Models

Search results