21 |
Probabilistic models in noisy environments : and their application to a visual prosthesis for the blind
Archambeau, Cédric, 26 September 2005
In recent years, probabilistic models have become fundamental techniques in machine learning. They are successfully applied in various engineering problems, such as robotics, biometrics, brain-computer interfaces or artificial vision, and will gain in importance in the near future. This work deals with the difficult but common situation where the data is either very noisy or scarce compared to the complexity of the process to model. We focus on latent variable models, which can be formalized as probabilistic graphical models and learned by the expectation-maximization algorithm or its variants (e.g., variational Bayes).
After carefully studying a non-exhaustive list of multivariate kernel density estimators, we established that locally adaptive estimators should be preferred in most applications. Unfortunately, these methods are usually sensitive to outliers and often have too many parameters to set. Therefore, we focus on finite mixture models, which do not suffer from these drawbacks provided some structural modifications are made.
Two questions are central in this dissertation: (i) how to make mixture models robust to noise, i.e. deal efficiently with outliers, and (ii) how to exploit side-channel information, i.e. additional information intrinsic to the data. In order to tackle the first question, we extend the training algorithms of the popular Gaussian mixture models to the Student-t mixture models. The Student-t distribution can be viewed as a heavy-tailed alternative to the Gaussian distribution, the robustness being tuned by an extra parameter, the degrees of freedom. Furthermore, we introduce a new variational Bayesian algorithm for learning Bayesian Student-t mixture models. This algorithm leads to very robust density estimators and clustering. To address the second question, we introduce manifold constrained mixture models. This new technique exploits the information that the data lies on a manifold of lower dimension than the dimension of the feature space. Taking the implicit geometrical data arrangement into account results in better generalization on unseen data.
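The effect of the degrees of freedom can be illustrated in one dimension. The sketch below is not the algorithm of the thesis; the function names are hypothetical and only the Python standard library is assumed. It compares the log-density that a Gaussian and a Student-t assign to an outlier, and computes the weight (nu + 1) / (nu + z^2) that the standard EM treatment of the Student-t gives each observation in the univariate case, which shrinks towards zero for points far from the centre.

```python
import math


def gaussian_logpdf(x, mu, sigma):
    """Log-density of a univariate Gaussian with mean mu and standard deviation sigma."""
    z2 = ((x - mu) / sigma) ** 2
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * z2


def student_t_logpdf(x, mu, sigma, nu):
    """Log-density of a univariate Student-t with location mu, scale sigma
    and nu degrees of freedom."""
    z2 = ((x - mu) / sigma) ** 2
    return (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
            - 0.5 * math.log(nu * math.pi) - math.log(sigma)
            - 0.5 * (nu + 1) * math.log1p(z2 / nu))


def em_weight(x, mu, sigma, nu):
    """Observation weight (nu + 1) / (nu + z^2) from the E-step of a
    univariate Student-t; small values mean the point is downweighted."""
    z2 = ((x - mu) / sigma) ** 2
    return (nu + 1) / (nu + z2)


if __name__ == "__main__":
    for x in (0.0, 2.0, 10.0):
        print(x,
              round(gaussian_logpdf(x, 0.0, 1.0), 3),
              round(student_t_logpdf(x, 0.0, 1.0, 3.0), 3),
              round(em_weight(x, 0.0, 1.0, 3.0), 3))
```

As nu grows, the weight tends to one for every point and the Student-t density approaches the Gaussian, so a small nu corresponds to stronger protection against outliers.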
Finally, we show that the latent variable framework used for learning mixture models can be extended to construct probabilistic regularization networks, such as the Relevance Vector Machines. Subsequently, we make use of these methods in the context of an optic nerve visual prosthesis to restore partial vision to blind people whose optic nerve is still functional. Although visual sensations can be induced electrically in the blind's visual field, the coding scheme of the visual information along the visual pathways is poorly known. Therefore, we use probabilistic models to link the stimulation parameters to the features of the visual perceptions. Both black-box and grey-box models are considered. The grey-box models take advantage of the known neurophysiological information and are more instructive to medical doctors and psychologists.
|
22 |
Composite Likelihood Estimation for Latent Variable Models with Ordinal and Continuous, or Ranking Variables
Katsikatsou, Myrsini, January 2013
The estimation of latent variable models with ordinal and continuous, or ranking variables is the research focus of this thesis. The existing estimation methods are discussed and a composite likelihood approach is developed. The main advantages of the new method are its low computational complexity, which remains unchanged regardless of the model size, and that it yields an asymptotically unbiased, consistent, and normally distributed estimator. The thesis consists of four papers. The first one investigates the two main formulations of the unrestricted Thurstonian model for ranking data along with the corresponding identification constraints. It is found that the extra identification constraints required in one of them lead to unreliable estimates unless the constraints coincide with the true values of the fixed parameters. In the second paper, a pairwise likelihood (PL) estimation is developed for factor analysis models with ordinal variables. The performance of PL is studied in terms of bias and mean squared error (MSE) and compared with that of the conventional estimation methods via a simulation study and through some real data examples. It is found that the PL estimates and standard errors have very small bias and MSE, both decreasing with the sample size, and that the method is competitive with the conventional ones. The results of the first two papers lead to the next one, where PL estimation is adapted to the unrestricted Thurstonian ranking model. As before, the performance of the proposed approach is studied through a simulation study with respect to relative bias and relative MSE and in comparison with the conventional estimation methods. The conclusions are similar to those of the second paper. The last paper extends the PL estimation to the whole structural equation modeling framework, where data may include both ordinal and continuous variables as well as covariates. The approach is demonstrated through an example run in R. 
The code used has been incorporated in the R package lavaan (version 0.5-11).
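For orientation, a pairwise log-likelihood for n observations on p ordinal variables is commonly written as below. The notation is an assumption made here, not taken from the thesis: \pi_{ij} denotes the bivariate marginal probability of the observed categories of variables i and j under the model with parameter \theta.

```latex
\[
  p\ell(\theta)
  = \sum_{m=1}^{n} \sum_{1 \le i < j \le p}
    \log \pi_{ij}\bigl(y_{mi},\, y_{mj};\, \theta\bigr)
\]
```

Each term involves only a two-dimensional probability, so the dimension of the integrals to be evaluated stays at two however many observed or latent variables the model contains; only the number of pairs, p(p-1)/2, grows.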
|
23 |
Bayesian Latent Variable Models for Biostatistical Applications
Ridall, Peter Gareth, January 2004
In this thesis we develop several kinds of latent variable models in order to address three types of biostatistical problem. The three problems are the treatment effect of carcinogens on tumour development, spatial interactions between plant species and motor unit number estimation (MUNE). The three types of data considered are: highly heterogeneous longitudinal count data, quadrat counts of species on a rectangular lattice and, lastly, electrophysiological data consisting of measurements of compound muscle action potential (CMAP) area and amplitude. Chapter 1 sets out the structure and the development of ideas presented in this thesis from the point of view of model structure, model selection, and efficiency of estimation. Chapter 2 is an introduction to the relevant literature that has influenced the development of this thesis. In Chapter 3 we use the EM algorithm for an application of an autoregressive hidden Markov model to describe longitudinal counts. The data is collected from experiments to test the effect of carcinogens on tumour growth in mice. Here we develop forward and backward recursions for calculating the likelihood and for estimation. Chapter 4 is the analysis of a similar kind of data using a more sophisticated model, incorporating random effects, but estimation this time is conducted from the Bayesian perspective. Bayesian model selection is also explored. In Chapter 5 we move to the two-dimensional lattice and construct a model for describing the spatial interaction of tree types. We also compare the merits of directed and undirected graphical models for describing the hidden lattice. Chapter 6 applies a Bayesian hierarchical model to MUNE, where the latent variable this time is multivariate Gaussian and dependent on a covariate, the stimulus. Model selection is carried out using the Bayes Information Criterion (BIC). 
In Chapter 7 we approach the same problem by using the reversible jump methodology (Green, 1995), where this time we use a dual Gaussian-Binary representation of the latent data. We conclude in Chapter 8 with suggestions for the direction of new work. In this thesis, all of the estimation carried out on real data has only been performed once we have been satisfied that estimation is able to retrieve the parameters from simulated data. Keywords: Amyotrophic lateral sclerosis (ALS), carcinogens, hidden Markov models (HMM), latent variable models, longitudinal data analysis, motor neurone disease (MND), partially ordered Markov models (POMMs), the pseudo auto-logistic model, reversible jump, spatial interactions.
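The forward recursion mentioned for Chapter 3 can be sketched for a plain hidden Markov model with discrete emissions. This is an illustration under simplifying assumptions, not the autoregressive count model of the thesis; the function name and argument layout are hypothetical. The recursion is rescaled at every step so that long sequences do not underflow.

```python
import math


def forward_loglik(init, trans, emit, obs):
    """Log-likelihood of an observation sequence under a discrete HMM.

    init[k]     : probability of starting in state k
    trans[j][k] : probability of moving from state j to state k
    emit[k][o]  : probability of emitting symbol o in state k
    obs         : list of observed symbol indices
    """
    n_states = len(init)
    # Initial step: joint probability of the first state and first observation.
    alpha = [init[k] * emit[k][obs[0]] for k in range(n_states)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by the emission.
            alpha = [
                sum(alpha[j] * trans[j][k] for j in range(n_states)) * emit[k][obs[t]]
                for k in range(n_states)
            ]
        # Rescale so alpha sums to one; the scales accumulate the log-likelihood.
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik


if __name__ == "__main__":
    init = [0.6, 0.4]
    trans = [[0.7, 0.3], [0.2, 0.8]]
    emit = [[0.9, 0.1], [0.3, 0.7]]
    print(forward_loglik(init, trans, emit, [0, 1, 1]))
```

The cost is linear in the sequence length and quadratic in the number of states, whereas summing over all state paths directly would be exponential in the sequence length.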
|
24 |
Recurrent neural network language generation for dialogue systems
Wen, Tsung-Hsien, January 2018
Language is the principal medium for ideas, while dialogue is the most natural and effective way for humans to interact with and access information from machines. Natural language generation (NLG) is a critical component of spoken dialogue and it has a significant impact on usability and perceived quality. Many commonly used NLG systems employ rules and heuristics, which tend to generate inflexible and stylised responses without the natural variation of human language. However, the frequent repetition of identical output forms can quickly make dialogue tedious for most real-world users. Additionally, these rules and heuristics are not scalable and hence not trivially extensible to other domains or languages. A statistical approach to language generation can learn language decisions directly from data without relying on hand-coded rules or heuristics, which brings scalability and flexibility to NLG. Statistical models also provide an opportunity to learn in-domain human colloquialisms and cross-domain model adaptations. A robust and quasi-supervised NLG model is proposed in this thesis. The model leverages a Recurrent Neural Network (RNN)-based surface realiser and a gating mechanism applied to input semantics. The model is motivated by the Long Short-Term Memory (LSTM) network. The RNN-based surface realiser and gating mechanism use a neural network to learn end-to-end language generation decisions from input dialogue act and sentence pairs; it also integrates sentence planning and surface realisation into a single optimisation problem. The single optimisation not only bypasses the costly intermediate linguistic annotations but also generates more natural and human-like responses. Furthermore, a domain adaptation study shows that the proposed model can be readily adapted and extended to new dialogue domains via a proposed recipe. 
Continuing the success of end-to-end learning, the second part of the thesis speculates on building an end-to-end dialogue system by framing it as a conditional generation problem. The proposed model encapsulates a belief tracker with a minimal state representation and a generator that takes the dialogue context to produce responses. These features suggest comprehension and fast learning. The proposed model is capable of understanding requests and accomplishing tasks after training on only a few hundred human-human dialogues. A complementary Wizard-of-Oz data collection method is also introduced to facilitate the collection of human-human conversations from online workers. The results demonstrate that the proposed model can talk to human judges naturally, without any difficulty, for a sample application domain. The results also suggest that introducing a stochastic latent variable helps the system model intrinsic variation in communicative intention much better.
|
25 |
Inference and applications for topic models / Inférence et applications pour les modèles thématiques
Dupuy, Christophe, 30 June 2017
Most current recommendation systems are based on ratings (i.e., numbers between 0 and 5) and try to suggest content (movies, restaurants...) to a user. These systems usually allow users to provide a text review of this content in addition to ratings. It is hard to extract useful information from raw text, while a rating does not contain much information on the content and the user. In this thesis, we tackle the problem of suggesting personalized readable text to users to help them make a quick decision about a piece of content. More specifically, we first build a topic model that predicts a personalized movie description from text reviews. Our model extracts distinct qualitative (i.e., conveying an opinion) and descriptive topics by combining text reviews and movie ratings in a joint probabilistic model. 
We evaluate our model on an IMDB dataset and illustrate its performance through comparison of topics. We then study parameter inference in large-scale latent variable models, which include most topic models. We propose a unified treatment of online inference for latent variable models from a non-canonical exponential family, and draw explicit links between several previously proposed frequentist or Bayesian methods. We also propose a novel inference method for the frequentist estimation of parameters that adapts MCMC methods to online inference of latent variable models with the proper use of local Gibbs sampling. For the specific latent Dirichlet allocation topic model, we provide an extensive set of experiments and comparisons with existing work, where our new approach outperforms all previously proposed methods. Finally, we propose a new class of determinantal point processes (DPPs) which can be manipulated for inference and parameter learning in potentially sublinear time in the number of items. This class, based on a specific low-rank factorization of the marginal kernel, is particularly suited to a subclass of continuous DPPs and DPPs defined on exponentially many items. We apply this new class to modelling text documents as samples from a DPP over sentences, and propose a conditional maximum likelihood formulation to model topic proportions, which is made possible with no approximation for our class of DPPs. We present an application to document summarization with a DPP on 2 to the power 500 items, where the summaries are composed of readable sentences.
|
26 |
Learning Stochastic Nonlinear Dynamical Systems Using Non-stationary Linear Predictors
Abdalmoaty, Mohamed, January 2017
The estimation problem of stochastic nonlinear parametric models is recognized to be very challenging due to the intractability of the likelihood function. Recently, several methods have been developed to approximate the maximum likelihood estimator and the optimal mean-square error predictor using Monte Carlo methods. Albeit asymptotically optimal, these methods come with several computational challenges and fundamental limitations. The contributions of this thesis can be divided into two main parts. In the first part, approximate solutions to the maximum likelihood problem are explored. Both analytical and numerical approaches, based on the expectation-maximization algorithm and the quasi-Newton algorithm, are considered. While analytic approximations are difficult to analyze, asymptotic guarantees can be established for methods based on Monte Carlo approximations. Yet, Monte Carlo methods come with their own computational difficulties; sampling in high-dimensional spaces requires an efficient proposal distribution to reduce the number of required samples to a reasonable value. In the second part, relatively simple prediction error method estimators are proposed. They are based on non-stationary one-step-ahead predictors that are linear in the observed outputs but nonlinear in the (assumed known) input. These predictors rely only on the first two moments of the model, and the computation of the likelihood function is not required. Consequently, the resulting estimators are defined via analytically tractable objective functions in several relevant cases. It is shown that, under mild assumptions, the estimators are consistent and asymptotically normal. In cases where the first two moments are analytically intractable due to the complexity of the model, it is possible to resort to vanilla Monte Carlo approximations. Several numerical examples demonstrate good performance of the suggested estimators in several cases that are usually considered challenging. 
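A predictor that is linear in past outputs and uses only the first two moments can be written in the generic form below. The notation is assumed here for illustration and is not taken from the thesis: \mu_t(\theta) is the model mean of output y_t, \Sigma(\theta) collects the model covariances of the outputs, and both depend on the known input. The quadratic criterion shown is one common choice for a prediction error method, not necessarily the one used in the thesis.

```latex
\[
  \hat{y}_{t|t-1}(\theta)
  = \mu_t(\theta)
  + \Sigma_{t,\,1:t-1}(\theta)\,\Sigma_{1:t-1}(\theta)^{-1}
    \bigl(y_{1:t-1} - \mu_{1:t-1}(\theta)\bigr),
  \qquad
  \hat{\theta}
  = \arg\min_{\theta} \sum_{t=1}^{N}
    \bigl(y_t - \hat{y}_{t|t-1}(\theta)\bigr)^{2}
\]
```

The predictor is the best linear approximation of y_t given y_{1:t-1} in the mean-square sense, so evaluating it requires means and covariances but never the likelihood itself.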
|