111 |
Probabilistic Independence Networks for Hidden Markov Probability Models
Smyth, Padhraic, Heckerman, David, Jordan, Michael 13 March 1996
Graphical techniques for modeling the dependencies of random variables have been explored in a variety of different areas including statistics, statistical physics, artificial intelligence, speech recognition, image processing, and genetics. Formalisms for manipulating these models have been developed relatively independently in these research communities. In this paper we explore hidden Markov models (HMMs) and related structures within the general framework of probabilistic independence networks (PINs). The paper contains a self-contained review of the basic principles of PINs. It is shown that the well-known forward-backward (F-B) and Viterbi algorithms for HMMs are special cases of more general inference algorithms for arbitrary PINs. Furthermore, the existence of inference and estimation algorithms for more general graphical models provides a set of analysis tools for HMM practitioners who wish to explore a richer class of HMM structures. Examples of relatively complex models to handle sensor fusion and coarticulation in speech recognition are introduced and treated within the graphical model framework to illustrate the advantages of the general approach.
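For readers unfamiliar with the two HMM algorithms named in this abstract, the following is a minimal sketch of the textbook forward pass and Viterbi recursion for a discrete HMM. It is not taken from the paper and does not show the general PIN inference algorithms; the function names and the list-of-lists parameter layout (`init`, `trans`, `emit`) are assumptions made for illustration.

```python
def forward(init, trans, emit, obs):
    """Likelihood P(obs) of a discrete HMM via the forward recursion.

    init[i]     : P(state_1 = i)
    trans[i][j] : P(state_{t+1} = j | state_t = i)
    emit[i][o]  : P(observation = o | state = i)
    (Parameter layout is an illustrative assumption.)
    """
    n = len(init)
    # alpha[i] = P(o_1..o_t, state_t = i)
    alpha = [init[i] * emit[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [emit[j][o] * sum(alpha[i] * trans[i][j] for i in range(n))
                 for j in range(n)]
    return sum(alpha)


def viterbi(init, trans, emit, obs):
    """Most probable state path and its joint probability with obs."""
    n = len(init)
    # delta[i] = max over paths ending in state i of P(path, o_1..o_t)
    delta = [init[i] * emit[i][obs[0]] for i in range(n)]
    backpointers = []
    for o in obs[1:]:
        prev = delta
        # Best predecessor of each state j at this time step.
        ptr = [max(range(n), key=lambda i: prev[i] * trans[i][j])
               for j in range(n)]
        delta = [prev[ptr[j]] * trans[ptr[j]][j] * emit[j][o]
                 for j in range(n)]
        backpointers.append(ptr)
    # Trace the best path backwards from the best final state.
    state = max(range(n), key=lambda i: delta[i])
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    return path[::-1], max(delta)
```

The forward pass replaces the `max` in Viterbi with a sum; this sum-versus-max pairing is the sense in which both are usually described as instances of one message-passing scheme.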
|
112 |
Bayesian Methods in Gaussian Graphical Models
Mitsakakis, Nikolaos 31 August 2010
This thesis contributes to the field of Gaussian Graphical Models by exploring, numerically or theoretically, several topics in Bayesian methods for these models, and by providing results whose further exploration points to a number of future research directions.
Gaussian Graphical Models are statistical methods for the investigation and representation of interdependencies between components of continuous random vectors. This thesis aims to investigate some issues related to the application of Bayesian methods for Gaussian Graphical Models. We adopt the popular $G$-Wishart conjugate prior $W_G(\delta,D)$ for the precision matrix. We propose an efficient sampling method for the $G$-Wishart distribution based on the Metropolis Hastings algorithm and show its validity through a number of numerical experiments. We show that this method can be easily used to estimate the Deviance Information Criterion, providing a computationally inexpensive approach for model selection.
In addition, we look at the marginal likelihood of a graphical model given a set of data. This is proportional to the ratio of the posterior normalizing constant to the prior normalizing constant. We explore methods for the estimation of this ratio, focusing primarily on applying the Monte Carlo simulation method of path sampling. We also explore numerically the effect of the completion of the incomplete matrix $D^{\mathcal{V}}$, a hyperparameter of the $G$-Wishart distribution, on the estimation of the normalizing constant.
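The ratio mentioned above can be written out under standard assumptions. The symbols $n$ (sample size), $p$ (dimension), $U$ (sum-of-products matrix of zero-mean data) and $I_G$ (the $G$-Wishart normalizing constant) are notation introduced here for illustration, not taken from the abstract. The second line is the generic path sampling identity for a ratio of normalizing constants, with $q(K \mid \theta)$ an unnormalized density that interpolates between the two distributions.

```latex
% Marginal likelihood of graph G under a W_G(\delta, D) prior on the precision matrix K
p(x_1,\dots,x_n \mid G) \;=\; (2\pi)^{-np/2}\,
  \frac{I_G(\delta + n,\; D + U)}{I_G(\delta,\; D)},
\qquad U = \sum_{i=1}^{n} x_i x_i^{\top}

% Path sampling identity for the log ratio of normalizing constants Z(1)/Z(0)
\log \frac{Z(1)}{Z(0)} \;=\; \int_0^1
  \mathbb{E}_{K \sim p(\cdot \mid \theta)}\!\left[
    \frac{\partial}{\partial \theta} \log q(K \mid \theta)
  \right] d\theta
```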
We also derive a series of exact and approximate expressions for the Bayes Factor between two graphs that differ by one edge. A new theoretical result regarding the limit of the normalizing constant multiplied by the hyperparameter $\delta$ is given, and its implications for the validity of an improper prior and of the resulting Bayes Factor are discussed.
|
113 |
Composable, Distributed-state Models for High-dimensional Time Series
Taylor, Graham William 03 March 2010
In this thesis we develop a class of nonlinear generative models for high-dimensional time series. The first key property of these models is their distributed, or "componential" latent state, which is characterized by binary stochastic variables which interact to explain the data. The second key property is the use of an undirected graphical model to represent the relationship between latent state (features) and observations. The final key property is composability: the proposed class of models can form the building blocks of deep networks by successively training each model on the features extracted by the previous one.
We first propose a model based on the Restricted Boltzmann Machine (RBM) that uses an undirected model with binary latent variables and real-valued "visible" variables. The latent and visible variables at each time step receive directed connections from the visible variables at the last few time-steps. This "conditional" RBM (CRBM) makes on-line inference efficient and allows us to use a simple approximate learning procedure. We demonstrate the power of our approach by synthesizing various motion sequences and by performing on-line filling in of data lost during motion capture. We also explore CRBMs as priors in the context of Bayesian filtering applied to multi-view and monocular 3D person tracking.
We extend the CRBM in a way that preserves its most important computational properties and introduces multiplicative three-way interactions that allow the effective interaction weight between two variables to be modulated by the dynamic state of a third variable. We introduce a factoring of the implied three-way weight tensor to permit a more compact parameterization. The resulting model can capture diverse styles of motion with a single set of parameters, and the three-way interactions greatly improve its ability to blend motion styles or to transition smoothly among them.
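As a hedged illustration of the factoring described above, a three-way weight tensor can be approximated by a sum over $F$ factors of products of three two-way matrices. The symbol names ($W^{v}$, $W^{h}$, $W^{z}$, and the index sets) are assumptions made here for illustration; the thesis may use different notation.

```latex
% Full three-way tensor: one weight per (visible i, hidden j, style/context k) triple
% Factored form: three matrices sharing a factor index f
W_{ijk} \;\approx\; \sum_{f=1}^{F} W^{v}_{if}\, W^{h}_{jf}\, W^{z}_{kf}

% Parameter count drops from N_v N_h N_z to F\,(N_v + N_h + N_z)
```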
In separate but related work, we revisit Products of Hidden Markov Models (PoHMMs). We show how the partition function can be estimated reliably via Annealed Importance Sampling. This enables us to demonstrate that PoHMMs outperform various flavours of HMMs on a variety of tasks and metrics, including log likelihood.
|
116 |
Nonparametric Learning in High Dimensions
Liu, Han 01 December 2010
This thesis develops flexible and principled nonparametric learning algorithms to explore, understand, and predict high dimensional and complex datasets. Such data appear frequently in modern scientific domains and lead to numerous important applications. For example, exploring high dimensional functional magnetic resonance imaging data helps us to better understand brain functionalities; inferring large-scale gene regulatory networks is crucial for new drug design and development; detecting anomalies in high dimensional transaction databases is vital for corporate and government security.
Our main results include a rigorous theoretical framework and efficient nonparametric learning algorithms that exploit hidden structures to overcome the curse of dimensionality when analyzing massive high dimensional datasets. These algorithms have strong theoretical guarantees and provide high dimensional nonparametric recipes for many important learning tasks, ranging from unsupervised exploratory data analysis to supervised predictive modeling. In this thesis, we address three aspects:
1 Understanding the statistical theories of high dimensional nonparametric inference, including risk, estimation, and model selection consistency;
2 Designing new methods for different data-analysis tasks, including regression, classification, density estimation, graphical model learning, multi-task learning, and spatial-temporal adaptive learning;
3 Demonstrating the usefulness of these methods in scientific applications, including functional genomics, cognitive neuroscience, and meteorology.
In the last part of this thesis, we also present the future vision of high dimensional and large-scale nonparametric inference.
|
117 |
Some problems in the theory & application of graphical models
Roddam, Andrew Wilfred January 1999
A graphical model is simply a representation of the results of an analysis of relationships between sets of variables. It can include the study of the dependence of one variable, or a set of variables on another variable or sets of variables, and can be extended to include variables which could be considered as intermediate to the others. This leads to the concept of representing these chains of relationships by means of a graph; where variables are represented by vertices, and relationships between the variables are represented by edges. These edges can be either directed or undirected, depending upon the type of relationship being represented. The thesis investigates a number of outstanding problems in the area of statistical modelling, with particular emphasis on representing the results in terms of a graph. The thesis will study models for multivariate discrete data and in the case of binary responses, some theoretical results are given on the relationship between two common models. In the more general setting of multivariate discrete responses, a general class of models is studied and an approximation to the maximum likelihood estimates in these models is proposed. This thesis also addresses the problem of measurement errors. An investigation into the effect that measurement error has on sample size calculations is given with respect to a general measurement error specification in both linear and binary regression models. Finally, the thesis presents, in terms of a graphical model, a re-analysis of a set of childhood growth data, collected in South Wales during the 1970s. Within this analysis, a new technique is proposed that allows the calculation of derived variables under the assumption that the joint relationships between the variables are constant at each of the time points.
|
118 |
Graphical Models for Robust Speech Recognition in Adverse Environments
Rennie, Steven J. 01 August 2008
Robust speech recognition in acoustic environments that contain multiple speech sources and/or complex non-stationary noise is a difficult problem, but one of great practical interest. The formalism of probabilistic graphical models constitutes a relatively new and very powerful tool for better understanding and extending existing models, learning, and inference algorithms; and a bedrock for the creative, quasi-systematic development of new ones. In this thesis a collection of new graphical models and inference algorithms for robust speech recognition are presented.
The problem of speech separation using multiple microphones is first treated. A family of variational algorithms for tractably combining multiple acoustic models of speech with observed sensor likelihoods is presented. The algorithms recover high quality estimates of the speech sources even when there are more sources than microphones, and have improved upon the state-of-the-art in terms of SNR gain by over 10 dB.
Next the problem of background compensation in non-stationary acoustic environments is treated. A new dynamic noise adaptation (DNA) algorithm for robust noise compensation is presented, and shown to outperform several existing state-of-the-art front-end denoising systems on the new DNA + Aurora II and Aurora II-M extensions of the Aurora II task.
Finally, the problem of speech recognition in the presence of competing speech using a single microphone is treated. The Iroquois system for multi-talker speech separation and recognition is presented. The system won the 2006 Pascal International Speech Separation Challenge, and amazingly, achieved super-human recognition performance on a majority of test cases in the task. The result marks a significant first in automatic speech recognition, and a milestone in computing.
|
119 |
Cumulative Distribution Networks: Inference, Estimation and Applications of Graphical Models for Cumulative Distribution Functions
Huang, Jim C. 01 March 2010
This thesis presents a class of graphical models for directly representing the joint cumulative distribution function (CDF) of many random variables, called cumulative distribution networks (CDNs). Unlike graphical models for probability density and mass functions, in a CDN, the marginal probabilities for any subset of variables are obtained by computing limits of functions in the model. We will show that the conditional independence properties in a CDN are distinct from the conditional independence properties of directed, undirected and factor graph models, but include the conditional independence properties of bidirected graphical models. As a result, CDNs are a parameterization for bidirected models that allows us to represent complex statistical dependence relationships between observable variables. We will provide a method for constructing a factor graph model with additional latent variables for which graph separation of variables in the corresponding CDN implies conditional independence of the separated variables in both the CDN and in the factor graph with the latent variables marginalized out. This will then allow us to construct multivariate extreme value distributions for which both a CDN and a corresponding factor graph representation exist.
In order to perform inference in such graphs, we describe the 'derivative-sum-product' (DSP) message-passing algorithm where messages correspond to derivatives of the joint cumulative distribution function. We will then apply CDNs to the problem of learning to rank, or estimating parametric models for ranking, where CDNs provide a natural means with which to model multivariate probabilities over ordinal variables such as pairwise preferences. We will show that many previous probability models for rank data, such as the Bradley-Terry and Plackett-Luce models, can be viewed as particular types of CDN. Applications of CDNs will be described for the problems of ranking players in multiplayer team-based games, document retrieval and discovering regulatory sequences in computational biology using the above methods for inference and estimation of CDNs.
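The two classical rank models named above can be sketched in a few lines. This shows only their standard textbook definitions, not the CDN formulation or the DSP algorithm from the thesis; the function names and the `strengths` dictionary are assumptions made for illustration.

```python
def bradley_terry(strength_i, strength_j):
    """P(item i is preferred to item j) for positive strength parameters."""
    return strength_i / (strength_i + strength_j)


def plackett_luce(ranking, strengths):
    """Probability of a full ranking (best item first).

    Items are chosen one at a time, each with probability proportional
    to its strength among the items not yet placed.
    """
    remaining = sum(strengths[item] for item in ranking)
    prob = 1.0
    for item in ranking:
        prob *= strengths[item] / remaining
        remaining -= strengths[item]
    return prob


# Hypothetical strengths for three items.
strengths = {"a": 2.0, "b": 1.0, "c": 1.0}
p_pair = bradley_terry(strengths["a"], strengths["b"])
p_rank = plackett_luce(["a", "b", "c"], strengths)
```

With only two items, the Plackett-Luce probability reduces to the Bradley-Terry pairwise probability.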
|
120 |
Value-adding business process modelling : determining the suitability of a business process modelling technique for a given application
Geyer, Rian Willem 12 1900
Thesis (MScEng)-- Stellenbosch University, 2013. / ENGLISH ABSTRACT: Organizations formally define and document their business processes in order to properly understand them and to subsequently enable their continuous development, improvement and management. In order to formally define and document their business processes, organizations can use Business Process Modelling, which represents the design of graphical models that portray the business processes of organizations.
It is however noted that it is difficult to select a suitable Business Process Modelling Technique in support of a specific application of Business Process Modelling. This is due to the considerable number of existing Business Process Modelling Techniques, the inherent impact of their varying capabilities and the lack of formal measures that are available to support evaluations regarding their suitability for specific modelling applications.
It is therefore considered appropriate to execute a research study that is aimed at the development and validation of a measurement framework that can be used to evaluate the suitability of Business Process Modelling Techniques for specific modelling applications. / AFRIKAANS SUMMARY (translated): Organisations define and document their business processes in a formal manner in order to understand them properly and consequently to enable their continuous development, improvement and management. To carry out this activity, organisations can use Business Process Modelling to design graphical models of their business processes.
It is noted, however, that it is difficult to choose a suitable Business Process Modelling Technique in support of a specific application of Business Process Modelling. This is due to the large number of existing Business Process Modelling Techniques, the impact of their varying capabilities, and the lack of formal measures that can be used to evaluate their suitability for specific modelling applications.
This leads to the decision to complete a study focused on the development and validation of a measurement framework that can be used to evaluate the suitability of Business Process Modelling Techniques for specific applications of Business Process Modelling.
|