Global ETD Search

1	topicmodels: An R Package for Fitting Topic Models Hornik, Kurt, Grün, Bettina January 2011 (has links) (PDF) Topic models allow the probabilistic modeling of term frequency occurrences in documents. The fitted model can be used to estimate the similarity between documents as well as between a set of specified keywords using an additional layer of latent variables which are referred to as topics. The R package topicmodels provides basic infrastructure for fitting topic models based on data structures from the text mining package tm. The package includes interfaces to two algorithms for fitting topic models: the variational expectation-maximization algorithm provided by David M. Blei and co-authors and an algorithm using Gibbs sampling by Xuan-Hieu Phan and co-authors.
2	Champs aléatoires de Markov cachés pour la cartographie du risque en épidémiologie / Hidden Markov random fields for risk mapping in epidemiology Azizi, Lamiae 13 December 2011 (has links) La cartographie du risque en épidémiologie permet de mettre en évidence des régionshomogènes en terme du risque aﬁn de mieux comprendre l’étiologie des maladies. Nousabordons la cartographie automatique d’unités géographiques en classes de risque commeun problème de classiﬁcation à l’aide de modèles de Markov cachés discrets et de modèlesde mélange de Poisson. Le modèle de Markov caché proposé est une variante du modèle dePotts, où le paramètre d’interaction dépend des classes de risque.Aﬁn d’estimer les paramètres du modèle, nous utilisons l’algorithme EM combiné à une approche variationnelle champ-moyen. Cette approche nous permet d’appliquer l’algorithmeEM dans un cadre spatial et présente une alternative efﬁcace aux méthodes d’estimation deMonte Carlo par chaîne de Markov (MCMC).Nous abordons également les problèmes d’initialisation, spécialement quand les taux de risquesont petits (cas des maladies animales). Nous proposons une nouvelle stratégie d’initialisationappropriée aux modèles de mélange de Poisson quand les classes sont mal séparées. Pourillustrer ces solutions proposées, nous présentons des résultats d’application sur des jeux dedonnées épidémiologiques animales fournis par l’INRA. / The analysis of the geographical variations of a disease and their representation on a mapis an important step in epidemiology. The goal is to identify homogeneous regions in termsof disease risk and to gain better insights into the mechanisms underlying the spread of thedisease. We recast the disease mapping issue of automatically classifying geographical unitsinto risk classes as a clustering task using a discrete hidden Markov model and Poisson classdependent distributions. The designed hidden Markov prior is non standard and consists of avariation of the Potts model where the interaction parameter can depend on the risk classes.The model parameters are estimated using an EM algorithm and the mean ﬁeld approximation. This provides a way to face the intractability of the standard EM in this spatial context,with a computationally efﬁcient alternative to more intensive simulation based Monte CarloMarkov Chain (MCMC) procedures.We then focus on the issue of dealing with very low risk values and small numbers of observedcases and population sizes. We address the problem of ﬁnding good initial parameter values inthis context and develop a new initialization strategy appropriate for spatial Poisson mixturesin the case of not so well separated classes as encountered in animal disease risk analysis.We illustrate the performance of the proposed methodology on some animal epidemiologicaldatasets provided by INRA. Classiﬁcation Cartographie du risque Mélanges de Poisson Modèle de Potts EM champ-moyen Classiﬁcation Discrete hidden Markov random ﬁeld Disease mapping Poisson mixtures Potts model Variational EM 510

Search results

topicmodels: An R Package for Fitting Topic Models

Champs aléatoires de Markov cachés pour la cartographie du risque en épidémiologie / Hidden Markov random fields for risk mapping in epidemiology