Global ETD Search

51	High-dimensional statistics : model specification and elementary estimators Yang, Eunho 16 January 2015 Modern statistics typically deals with complex data, in particular where the ambient dimension of the problem p may be of the same order as, or even substantially larger than, the sample size n. It has now become well understood that even in this type of high-dimensional scaling, statistically consistent estimators can be achieved provided one imposes structural constraints on the statistical models. In spite of great success over the last few decades, we are still experiencing bottlenecks of two distinct kinds: (I) in multivariate modeling, the data modeling assumption is typically limited to instances such as Gaussian or Ising models, and hence handling varied types of random variables is still restricted, and (II) in terms of computation, the learning or estimation process is inefficient, especially when p is extremely large, since in the current paradigm for high-dimensional statistics, regularization terms induce non-differentiable optimization problems, which do not have closed-form solutions in general. The thesis addresses these two distinct but highly complementary problems: (I) statistical model specification beyond the standard Gaussian or Ising models for data of varied types, and (II) computationally efficient elementary estimators for high-dimensional statistical models. High-dimensional statistics Markov random fields Graphical models Closed-form estimators
52	Statistical Text Analysis for Social Science O'Connor, Brendan T. 01 August 2014 What can text corpora tell us about society? How can automatic text analysis algorithms efficiently and reliably analyze the social processes revealed in language production? This work develops statistical text analyses of dynamic social and news media datasets to extract indicators of underlying social phenomena, and to reveal how social factors guide linguistic production. This is illustrated through three case studies: first, examining whether sentiment expressed in social media can track opinion polls on economic and political topics (Chapter 3); second, analyzing how novel online slang terms can be very specific to geographic and demographic communities, and how these social factors affect their transmission over time (Chapters 4 and 5); and third, automatically extracting political events from news articles, to assist analyses of the interactions of international actors over time (Chapter 6). We demonstrate a variety of computational, linguistic, and statistical tools that are employed for these analyses, and also contribute MiTextExplorer, an interactive system for exploratory analysis of text data against document covariates, whose design was informed by the experience of researching these and other similar works (Chapter 2). These case studies illustrate recurring themes toward developing text analysis as a social science methodology: computational and statistical complexity, and domain knowledge and linguistic assumptions. computational social science natural language processing text mining quantitative text analysis machine learning probabilistic graphical models
53	Word meaning in context as a paraphrase distribution : evidence, learning, and inference Moon, Taesun, Ph. D. 25 October 2011 In this dissertation, we introduce a graph-based model of instance-based, usage meaning that is cast as a problem of probabilistic inference. The main aim of this model is to provide a flexible platform that can be used to explore multiple hypotheses about usage meaning computation. Our model takes up and extends the proposals of Erk and Pado [2007] and McCarthy and Navigli [2009] by representing usage meaning as a probability distribution over potential paraphrases. We use undirected graphical models to infer this probability distribution for every content word in a given sentence. Graphical models represent complex probability distributions through a graph. In the graph, nodes stand for random variables, and edges stand for direct probabilistic interactions between them. The lack of an edge between any two variables reflects an independence assumption. In our model, we represent each content word of the sentence through two adjacent nodes: the observed node represents the surface form of the word itself, and the hidden node represents its usage meaning. The distribution over values that we infer for the hidden node is a paraphrase distribution for the observed word. To encode the fact that lexical semantic information is exchanged between syntactic neighbors, the graph contains edges that mirror the dependency graph for the sentence. Further knowledge sources that influence the hidden nodes are represented through additional edges that, for example, connect to document topic. The integration of adjacent knowledge sources is accomplished in a standard way by multiplying factors and marginalizing over variables.
Evaluating on a paraphrasing task, we find that our model outperforms the current state-of-the-art usage vector model [Thater et al., 2010] on all parts of speech except verbs, where the previous model wins by a small margin. But our main focus is not on the numbers but on the fact that our model is flexible enough to encode different hypotheses about usage meaning computation. In particular, we concentrate on five questions (with minor variants):
- Nonlocal syntactic context: Existing usage vector models only use a word's direct syntactic neighbors for disambiguation or inferring some other meaning representation. Would it help to have contextual information instead "flow" along the entire dependency graph, each word's inferred meaning relying on the paraphrase distribution of its neighbors?
- Influence of collocational information: In some cases, it is intuitively plausible to use the selectional preference of a neighboring word towards the target to determine its meaning in context. How does incorporating selectional preferences into the model affect performance?
- Non-syntactic bag-of-words context: To what extent can non-syntactic information in the form of bag-of-words context help in inferring meaning?
- Effects of parametrization: We experiment with two transformations of MLE. One interpolates various MLEs; the other transforms the MLE by exponentiating pointwise mutual information. Which performs better?
- Type of hidden nodes: Our model posits a tier of hidden nodes immediately adjacent to the surface tier of observed words to capture dynamic usage meaning. We examine two variants of the hidden nodes: in one, the nodes have actual words as values; in the other, the nodes have nameless indexes as values. The former has the benefit of interpretability while the latter allows more standard parameter estimation.
Portions of this dissertation are derived from joint work between the author and Katrin Erk [submitted].
Computational linguistics Lexical semantics Probabilistic graphical models Natural language processing Word sense disambiguation Paraphrasing
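The "multiplying factors and marginalizing over variables" step mentioned in the abstract above can be sketched on a toy model. This is only a minimal illustration under stated assumptions: the node names, paraphrase candidates, and potential values below are invented and do not come from the dissertation, and inference is done by plain enumeration, not by whatever algorithm the dissertation uses.

```python
from itertools import product

# Hypothetical hidden nodes, each ranging over made-up paraphrase candidates.
PARAPHRASES = {
    "h_verb": ["acquire", "purchase"],
    "h_noun": ["firm", "business"],
}

# Unary factors: compatibility of each paraphrase with its observed word
# (illustrative numbers only).
UNARY = {
    "h_verb": {"acquire": 2.0, "purchase": 1.0},
    "h_noun": {"firm": 1.0, "business": 3.0},
}

# Pairwise factor on the assumed dependency edge between the two hidden nodes.
PAIRWISE = {
    ("acquire", "firm"): 4.0,
    ("acquire", "business"): 1.0,
    ("purchase", "firm"): 1.0,
    ("purchase", "business"): 2.0,
}


def marginal(target):
    """Paraphrase distribution for one hidden node.

    Multiplies all factors for every joint assignment, sums out the
    other variable, and normalizes.
    """
    names = list(PARAPHRASES)
    scores = {value: 0.0 for value in PARAPHRASES[target]}
    for assignment in product(*(PARAPHRASES[name] for name in names)):
        values = dict(zip(names, assignment))
        verb, noun = values["h_verb"], values["h_noun"]
        weight = UNARY["h_verb"][verb] * UNARY["h_noun"][noun] * PAIRWISE[(verb, noun)]
        scores[values[target]] += weight
    total = sum(scores.values())
    return {value: score / total for value, score in scores.items()}
```

Calling `marginal("h_verb")` returns a normalized distribution over the two verb candidates. Enumeration is only feasible for tiny graphs such as this one.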
54	Greedy structure learning of Markov Random Fields Johnson, Christopher Carroll 04 November 2011 Probabilistic graphical models are used in a variety of domains to capture and represent general dependencies in joint probability distributions. In this document we examine the problem of learning the structure of an undirected graphical model, also called a Markov Random Field (MRF), given a set of independent and identically distributed (i.i.d.) samples. Specifically, we introduce an adaptive forward-backward greedy algorithm for learning the structure of a discrete, pairwise MRF given a high-dimensional set of i.i.d. samples. The algorithm works by greedily estimating the neighborhood of each node independently through a series of forward and backward steps. By imposing a restricted strong convexity condition on the structure of the learned graph, we show that the structure can be fully learned with high probability given $n=\Omega(d\log (p))$ samples, where $d$ is the dimension of the graph and $p$ is the number of nodes. This is a significant improvement over existing convex-optimization-based algorithms that require a sample complexity of $n=\Omega(d^2\log(p))$ and a stronger irrepresentability condition. We further support these claims with an empirical comparison of the greedy algorithm to node-wise $\ell_1$-regularized logistic regression, and provide a real data analysis of the greedy algorithm using the Audioscrobbler music listener dataset. The results of this document provide an additional representation of work submitted by A. Jalali, C. Johnson, and P. Ravikumar to NIPS 2011. Machine learning Graphical models Markov Random Fields Structure learning Probability Uncertainty Greedy algorithms
55	Covariate selection and propensity score specification in causal inference Waernbaum, Ingeborg January 2008 This thesis makes contributions to the statistical research field of causal inference in observational studies. The results obtained are directly applicable in many scientific fields where effects of treatments are investigated and yet controlled experiments are difficult or impossible to implement. In the first paper we define a partially specified directed acyclic graph (DAG) describing the independence structure of the variables under study. Using the DAG we show that, given that unconfoundedness holds, we can use the observed data to select minimal sets of covariates to control for. General covariate selection algorithms are proposed to target the defined minimal subsets. The results of the first paper are generalized in Paper II to include the presence of unobserved covariates. Moreover, the identification assumptions from the first paper are relaxed. To implement the covariate selection without parametric assumptions we propose in the third paper the use of a model-free variable selection method from the framework of sufficient dimension reduction. By simulation the performance of the proposed selection methods is investigated. Additionally, we study finite sample properties of treatment effect estimators based on the selected covariate sets. In Paper IV we investigate misspecifications of parametric models of a scalar summary of the covariates, the propensity score. Motivated by common model specification strategies we describe misspecifications of parametric models for which unbiased estimators of the treatment effect are available. Consequences of the misspecification for the efficiency of treatment effect estimators are also studied. Covariate selection graphical models matching observational studies treatment effects unconfoundedness Statistics Statistik
56	Spatiotemporal Gene Networks from ISH Images Puniyani, Kriti 01 September 2013 As large-scale techniques for studying and measuring gene expressions have been developed, automatically inferring gene interaction networks from expression data has emerged as a popular technique to advance our understanding of cellular systems. Accurate prediction of gene interactions, especially in multicellular organisms such as Drosophila or humans, requires temporal and spatial analysis of gene expressions, which is not easily obtainable from microarray data. New image-based techniques using in situ hybridization (ISH) have recently been developed to allow large-scale spatial-temporal profiling of whole body mRNA expression. However, analysis of such data for discovering new gene interactions still remains an open challenge. This thesis studies the question of predicting gene interaction networks from ISH data in three parts. First, we present SPEX2, a computer vision pipeline to extract informative features from ISH data. Next, we present an algorithm, GINI, for learning spatial gene interaction networks from embryonic ISH images at a single time step. GINI combines multi-instance kernels with recent work in learning sparse undirected graphical models to predict interactions between genes. Finally, we propose NP-MuScL (nonparanormal multi source learning) to estimate a gene interaction network that is consistent with multiple sources of data, having the same underlying relationships between the nodes. NP-MuScL casts the network estimation problem as estimating the structure of a sparse undirected graphical model. We use the semiparametric Gaussian copula to model the distribution of the different data sources, with the different copulas sharing the same covariance matrix, and show how to estimate such a model in the high dimensional scenario. We apply our algorithms on more than 100,000 Drosophila embryonic ISH images from the Berkeley Drosophila Genome Project.
Each of the 6 time steps in Drosophila embryonic development is treated as a separate data source. With spatial gene interactions predicted via GINI, and temporal predictions combined via NP-MuScL, we are finally able to predict spatiotemporal gene networks from these images. bioimaging Gaussian graphical models gene networks sparsity gene expression high dimensional inference
57	Continuous Graphical Models for Static and Dynamic Distributions: Application to Structural Biology Razavian, Narges Sharif 01 September 2013 Generative models of protein structure enable researchers to predict the behavior of proteins under different conditions. Continuous graphical models are powerful and efficient tools for modeling static and dynamic distributions, which can be used for learning generative models of molecular dynamics. In this thesis, we develop new and improved continuous graphical models, to be used in modeling of protein structure. We first present von Mises graphical models, and develop consistent and efficient algorithms for sparse structure learning, parameter estimation, and inference. We compare our model to sparse Gaussian graphical models (GGMs) and show that it outperforms GGMs on synthetic and Engrailed protein molecular dynamics datasets. Next, we develop algorithms to estimate mixtures of von Mises graphical models using Expectation Maximization, and show that these models outperform von Mises, Gaussian and mixture of Gaussian graphical models in terms of prediction accuracy in an imputation test on non-redundant protein structure datasets. We then use non-paranormal and nonparametric graphical models, which have extensive representation power, and compare several state-of-the-art structure learning methods that can be used prior to nonparametric inference in reproducing kernel Hilbert space embedded graphical models. To be able to take advantage of the nonparametric models, we also propose feature space embedded belief propagation, and use random Fourier based feature approximation in our proposed feature belief propagation, to scale the inference algorithm to larger datasets. To improve the scalability further, we also show the integration of a coreset selection algorithm with the nonparametric inference, and show that the combined model scales to large datasets with very small adverse effect on the quality of predictions.
Finally, we present time-varying sparse Gaussian graphical models, to learn smoothly varying graphical models of molecular dynamics simulation data, and present results on the CypA protein. machine learning graphical models kernel belief propagation structural biology structure learning inference
58	Normal Factor Graphs Al-Bashabsheh, Ali 25 February 2014 This thesis introduces normal factor graphs under a new semantics, namely, the exterior function semantics. Initially, this work was motivated by two distinct lines of research. One line is "holographic algorithms," a powerful approach introduced by Valiant for solving various counting problems in computer science; the other is "normal graphs," an elegant framework proposed by Forney for representing codes defined on graphs. The nonrestrictive normality constraint enables the notion of holographic transformations for normal factor graphs. We establish a theorem, called the generalized Holant theorem, which relates a normal factor graph to its holographic transformation. We show that the generalized Holant theorem on one hand underlies the principle of holographic algorithms, and on the other reduces to a general duality theorem for normal factor graphs, a special case of which was first proved by Forney. As an application beyond Forney's duality, we show that the normal factor graphs duality facilitates the approximation of the partition function for the two-dimensional nearest-neighbor Potts model. In the course of our development, we formalize a new semantics for normal factor graphs, which highlights various linear algebraic properties that enable the use of normal factor graphs as a linear algebraic tool. Indeed, we demonstrate the ability of normal factor graphs to encode several concepts from linear algebra and present normal factor graphs as a generalization of "trace diagrams." We illustrate, with examples, the workings of this framework and how several identities from linear algebra may be obtained using a simple graphical manipulation procedure called "vertex merging/splitting." We also discuss translation association schemes with the aid of normal factor graphs, which we believe provides a simple approach to understanding the subject.
Further, under the new semantics, normal factor graphs provide a probabilistic model that unifies several graphical models such as factor graphs, convolutional factor graphs, and cumulative distribution networks. Holographic transformations sum of products partition function probabilistic models graphical models factor graphs trace diagrams
59	ARMA Identification of Graphical Models Avventi, Enrico, Lindquist, Anders, Wahlberg, Bo January 2013 Consider a Gaussian stationary stochastic vector process with the property that designated pairs of components are conditionally independent given the rest of the components. Such processes can be represented on a graph where the components are nodes and the lack of a connecting link between two nodes signifies conditional independence. This leads to a sparsity pattern in the inverse of the matrix-valued spectral density. Such graphical models find applications in speech, bioinformatics, image processing, econometrics and many other fields, where the problem of fitting an autoregressive (AR) model to such a process has been considered. In this paper we take this problem one step further, namely to fit an autoregressive moving-average (ARMA) model to the same data. We develop a theoretical framework and an optimization procedure which also sheds further light on previous approaches and results. This procedure is then applied to the identification problem of estimating the ARMA parameters as well as the topology of the graph from statistical data. / Updated from "Preprint" to "Article" QC 20130627 conditional independence graphical models system identification MATHEMATICS MATEMATIK
60	Ising Graphical Model Kamenetsky, Dmitry, dkamen@rsise.anu.edu.au January 2010 The Ising model is an important model in statistical physics, with over 10,000 papers published on the topic. This model assumes binary variables and only local pairwise interactions between neighbouring nodes. Inference for the general Ising model is NP-hard; this includes tasks such as calculating the partition function, finding a lowest-energy (ground) state and computing marginal probabilities. Past approaches have proceeded by working with classes of tractable Ising models, such as Ising models defined on a planar graph. For such models, the partition function and ground state can be computed exactly in polynomial time by establishing a correspondence with perfect matchings in a related graph. In this thesis we continue this line of research. In particular we simplify previous inference algorithms for the planar Ising model. The key to our construction is the complementary correspondence between graph cuts of the model graph and perfect matchings of its expanded dual. We show that our exact algorithms are effective and efficient on a number of real-world machine learning problems. We also investigate heuristic methods for approximating ground states of non-planar Ising models. We show that in this setting our approximate algorithms are superior to current state-of-the-art methods. Ising model graphical models computer vision machine learning image segmentation computer go
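To make the inference tasks named in the abstract above concrete, the following minimal sketch computes a partition function and a ground state by exhaustive enumeration. It is not the thesis's planar algorithm: it visits all 2^n spin configurations, which is exactly the cost that tractable special cases avoid. The energy convention E(s) = -(sum of J_ij s_i s_j + sum of h_i s_i), the unit temperature, and all function and parameter names are assumptions chosen for illustration.

```python
import math
from itertools import product


def ising_energy(spins, couplings, fields):
    """Energy under the assumed convention E(s) = -(pairwise + local terms)."""
    pair = sum(j * spins[a] * spins[b] for (a, b), j in couplings.items())
    local = sum(h * s for h, s in zip(fields, spins))
    return -(pair + local)


def brute_force_inference(couplings, fields):
    """Return (partition function, a ground state, its energy) by enumeration.

    couplings maps node-index pairs (a, b) to coupling strengths J_ab;
    fields lists the local field h_i of each node. Spins take values -1 or +1.
    """
    n = len(fields)
    partition = 0.0
    best_state, best_energy = None, math.inf
    for spins in product((-1, 1), repeat=n):  # 2**n configurations
        energy = ising_energy(spins, couplings, fields)
        partition += math.exp(-energy)  # Boltzmann weight at unit temperature
        if energy < best_energy:
            best_state, best_energy = spins, energy
    return partition, best_state, best_energy
```

For two spins joined by a single coupling of strength 1 and no local fields, `brute_force_inference({(0, 1): 1.0}, [0.0, 0.0])` gives a partition function of 4·cosh(1), with both aligned configurations tied as ground states.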
