Global ETD Search

41	Modèles linéaires généralisés à effets aléatoires : contributions au choix de modèle et au modèle de mélange Martinez, Marie-José 29 September 2006 (has links) (PDF) This work is devoted to the study of generalized linear models with random effects (GL2M). In these models, under the assumption that the random effects are normally distributed, the likelihood based on the marginal distribution of the response vector generally has no closed form. In the first part of our work, we revisit several approximate estimation methods, which rely on approximations made at different levels depending on the line of reasoning. The second part is devoted to establishing model selection criteria for GL2Ms. We return to two estimation methods that require the construction of linearized models, and we propose criteria based on the marginal likelihood computed in the linearized model obtained at convergence of the estimation procedure. The third and final part deals with mixtures of GL2Ms. The mixture components are defined by GL2Ms and represent different possible states of the individuals. For the exponential distribution, we propose a method for estimating the mixture parameters based on a linearization specific to that distribution. We then propose a more general method that applies to a mixture of arbitrary GL2Ms. This method relies on a Metropolis-Hastings step to build an MCEM-type algorithm. The methods developed are tested by simulation. [MATH] Mathematics Generalized linear models Random effects Estimation Model selection Mixture model EM algorithm Metropolis-Hastings algorithm
42	Population SAMC, ChIP-chip Data Analysis and Beyond Wu, Mingqi December 2010 (has links) This dissertation research consists of two topics: population stochastic approximation Monte Carlo (Pop-SAMC) for Bayesian model selection problems, and ChIP-chip data analysis. The following two paragraphs give a brief introduction to each of the two topics, respectively. Although reversible jump MCMC (RJMCMC) can traverse the space of possible models in Bayesian model selection problems, it is prone to becoming trapped in local modes when the model space is complex. SAMC, proposed by Liang, Liu and Carroll, essentially overcomes the difficulty in dimension-jumping moves by introducing a self-adjusting mechanism. However, this learning mechanism has not yet reached its maximum efficiency. In this dissertation, we propose a Pop-SAMC algorithm; it works on population chains of SAMC, which can provide a more efficient self-adjusting mechanism, and makes use of the crossover operator from genetic algorithms to further increase its efficiency. Under mild conditions, the convergence of this algorithm is proved. The effectiveness of Pop-SAMC in Bayesian model selection problems is examined through a change-point identification example and a large-p linear regression variable selection example. The numerical results indicate that Pop-SAMC significantly outperforms both single-chain SAMC and RJMCMC. In the ChIP-chip data analysis study, we developed two methodologies to identify transcription factor binding sites: a Bayesian latent model and a population-based test. The former models the neighboring dependence of probes by introducing a latent indicator vector; the latter provides a nonparametric method for evaluating test scores in a multiple hypothesis test by making use of population information of samples. Both methods are applied to real and simulated datasets. 
The numerical results indicate the Bayesian latent model can outperform the existing methods, especially when the data contain outliers, and the use of population information can significantly improve the power of multiple hypothesis tests. Markov Chain Monte Carlo Stochastic Approximation Metropolis-Hastings Algorithm Bayesian Model Selection ChIP-chip Latent Variable Multiple Hypothesis Test
43	Bayesian Inference in the Multinomial Logit Model Frühwirth-Schnatter, Sylvia, Frühwirth, Rudolf January 2012 (has links) (PDF) The multinomial logit model (MNL) possesses a latent variable representation in terms of random variables following a multivariate logistic distribution. Based on multivariate finite mixture approximations of the multivariate logistic distribution, various data-augmented Metropolis-Hastings algorithms are developed for a Bayesian inference of the MNL model.
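As a point of reference for the sampling idea in this entry, here is a minimal sketch of a plain random-walk Metropolis-Hastings sampler for a multinomial logit posterior. It is not the data-augmented algorithm the abstract describes: the baseline-category parameterization, the independent normal prior with `prior_sd`, the Gaussian proposal with `step`, and the function names are all assumptions made for illustration.

```python
import math
import random


def mnl_log_posterior(beta, X, y, prior_sd=10.0):
    """Log posterior of a multinomial logit model, up to an additive constant.

    beta: one coefficient vector per non-baseline category (assumed layout).
    X: list of covariate vectors; y: labels in 0..K-1, with category 0 as
    the baseline whose coefficients are fixed at zero.
    """
    logp = 0.0
    for x, label in zip(X, y):
        # Linear predictors; the baseline category has predictor 0.
        eta = [0.0] + [sum(b * v for b, v in zip(bk, x)) for bk in beta]
        m = max(eta)  # subtract the max for a stable log-sum-exp
        log_norm = m + math.log(sum(math.exp(e - m) for e in eta))
        logp += eta[label] - log_norm
    # Independent N(0, prior_sd^2) prior on every coefficient (assumption).
    logp -= sum(b * b for bk in beta for b in bk) / (2.0 * prior_sd ** 2)
    return logp


def random_walk_mh(X, y, n_categories, n_iter=2000, step=0.2, seed=0):
    """Random-walk Metropolis-Hastings; returns the draws and acceptance rate."""
    rng = random.Random(seed)
    dim = len(X[0])
    beta = [[0.0] * dim for _ in range(n_categories - 1)]
    current = mnl_log_posterior(beta, X, y)
    draws, accepted = [], 0
    for _ in range(n_iter):
        proposal = [[b + rng.gauss(0.0, step) for b in bk] for bk in beta]
        candidate = mnl_log_posterior(proposal, X, y)
        # The proposal is symmetric, so the acceptance probability is
        # min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            beta, current = proposal, candidate
            accepted += 1
        draws.append([list(bk) for bk in beta])
    return draws, accepted / n_iter
```

Calling `random_walk_mh(X, y, n_categories=3)` on a small dataset returns the chain of coefficient draws together with the acceptance rate. In practice the step size would need tuning, which is one of the costs that tailored samplers aim to reduce.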
44	Inference and prediction in a multiple structural break model of economic time series Jiang, Yu 01 May 2009 (has links) This thesis develops a new Bayesian approach to structural break modeling. The focuses of the approach are the modeling of in-sample structural breaks and forecasting time series allowing out-of-sample breaks. Our model has some desirable features. First, the number of regimes is not fixed and is treated as a random variable in our model. Second, our model adopts a hierarchical prior for regime coefficients, which allows for the regime coefficients of one regime to contain information about regime coefficients of other regimes. However, the regime coefficients can be analytically integrated out of the posterior distribution and therefore we only need to deal with one level of the hierarchy. Third, the implementation of our model is simple and the computational cost is low. Our model is applied to two different time series: S&P 500 monthly returns and U.S. real GDP quarterly growth rates. We linked breaks detected by our model to certain historical events. Markov Chain Monte Carlo Metropolis-Hastings Real GDP Growth S&P 500 Returns Structural Breaks Applied Mathematics
45	Efficient Path and Parameter Inference for Markov Jump Processes Boqian Zhang (6563222) 15 May 2019 (has links) Markov jump processes are continuous-time stochastic processes widely used in a variety of applied disciplines. Inference typically proceeds via Markov chain Monte Carlo (MCMC), the state-of-the-art being a uniformization-based auxiliary variable Gibbs sampler. This was designed for situations where the process parameters are known, and Bayesian inference over unknown parameters is typically carried out by incorporating it into a larger Gibbs sampler. This strategy of sampling parameters given path, and path given parameters can result in poor Markov chain mixing. In this thesis, we focus on the problem of path and parameter inference for Markov jump processes. In the first part of the thesis, a simple and efficient MCMC algorithm is proposed to address the problem of path and parameter inference for Markov jump processes. Our scheme brings Metropolis-Hastings approaches for discrete-time hidden Markov models to the continuous-time setting, resulting in a complete and clean recipe for parameter and path inference in Markov jump processes. In our experiments, we demonstrate superior performance over Gibbs sampling, a more naive Metropolis-Hastings algorithm we propose, as well as another popular approach, particle Markov chain Monte Carlo. We also show our sampler inherits geometric mixing from an ‘ideal’ sampler that is computationally much more expensive. In the second part of the thesis, a novel collapsed variational inference algorithm is proposed. Our variational inference algorithm leverages ideas from discrete-time Markov chains, and exploits a connection between Markov jump processes and discrete-time Markov chains through uniformization. 
Our algorithm proceeds by marginalizing out the parameters of the Markov jump process, and then approximating the distribution over the trajectory with a factored distribution over segments of a piecewise-constant function. Unlike MCMC schemes that marginalize out transition times of a piecewise-constant process, our scheme optimizes the discretization of time, resulting in significant computational savings. We apply our ideas to synthetic data as well as a dataset of check-in recordings, where we demonstrate superior performance over state-of-the-art MCMC methods. Statistics Continuous-time Markov chain Markov chain Monte Carlo Metropolis-Hastings Algorithm Uniformization Geometric Ergodicity Variational Bayes
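The connection between Markov jump processes and discrete-time Markov chains that this entry mentions can be illustrated with a short forward simulation by uniformization. This is only a sketch of the standard construction, not the thesis's sampler or its variational algorithm; the function name `uniformized_path`, the default choice of `omega`, and the list-of-lists rate matrix are assumptions.

```python
import random


def uniformized_path(Q, start, T, omega=None, seed=0):
    """Simulate a Markov jump process on [0, T) by uniformization.

    Q: rate (generator) matrix as a list of rows, each row summing to zero.
    omega: rate of the dominating Poisson process; any value at least as
    large as the largest exit rate max_i(-Q[i][i]) is valid.
    Returns the list of (time, state) jumps and the transition matrix B.
    """
    rng = random.Random(seed)
    n = len(Q)
    max_rate = max(-Q[i][i] for i in range(n))
    if omega is None:
        omega = 2.0 * max_rate  # illustrative default (assumption)
    if omega <= 0.0 or omega < max_rate:
        raise ValueError("omega must be positive and >= the largest exit rate")
    # Discrete-time transition matrix B = I + Q / omega; its diagonal gives
    # the probability of a self-transition ("virtual jump").
    B = [[(1.0 if i == j else 0.0) + Q[i][j] / omega for j in range(n)]
         for i in range(n)]
    t, state = 0.0, start
    path = [(0.0, start)]
    while True:
        t += rng.expovariate(omega)  # next candidate jump time
        if t >= T:
            break
        new_state = rng.choices(range(n), weights=B[state])[0]
        if new_state != state:  # record real jumps only
            path.append((t, new_state))
            state = new_state
    return path, B
```

For example, `uniformized_path([[-1.0, 1.0], [2.0, -2.0]], start=0, T=5.0)` returns a piecewise-constant trajectory as a list of jump times and states. Larger values of `omega` add more virtual jumps without changing the law of the path.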
46	Flexible Multidimensional Item Response Theory Models Incorporating Response Styles Stanley, Leanne M. 23 October 2017 (has links) No description available. Personality Psychology Quantitative Psychology multidimensional item response theory Big Five personality dimensions
47	Relative Role of Uncertainty for Predictions of Future Southeastern U.S. Pine Carbon Cycling Jersild, Annika Lee 06 July 2016 (has links) Predictions of how forest productivity and carbon sequestration will respond to climate change are essential for making forest management decisions and adapting to future climate. However, current predictions can include considerable uncertainty that is not well quantified. To address the need for better quantification of uncertainty, we calculated and compared ecosystem model parameter, ecosystem model process, climate model, and climate scenario uncertainty for predictions of Southeastern U.S. pine forest productivity. We applied data assimilation using Metropolis-Hastings Markov chain Monte Carlo to fuse diverse datasets with the Physiological Principles Predicting Growth model. The spatially and temporally diverse datasets allowed for novel constraints on ecosystem model parameters and for the quantification of uncertainty associated with parameterization and model structure (process). Overall, we found that ecosystem model parameter and process uncertainty was higher than climate model uncertainty. We determined that climate change will likely increase terrestrial carbon storage and that higher emission scenarios increase the uncertainty in our predictions. In addition, we determined regional variations in biomass accumulation in response to changes in frost days, temperature, and vapor pressure deficit. Since the uncertainty associated with ecosystem model parameters and process was larger than the uncertainty associated with climate predictions, our results indicate that better constraining parameters in ecosystem models and improving the mathematical structure of ecosystem models can improve future predictions of forest productivity and carbon sequestration. 
/ Master of Science data assimilation carbon sequestration pine plantation management ecosystem modeling identifying uncertainty
48	Accelerating Monte Carlo methods for Bayesian inference in dynamical models Dahlin, Johan January 2016 (has links) Making decisions and predictions from noisy observations are two important and challenging problems in many areas of society. Some examples of applications are recommendation systems for online shopping and streaming services, connecting genes with certain diseases and modelling climate change. In this thesis, we make use of Bayesian statistics to construct probabilistic models given prior information and historical data, which can be used for decision support and predictions. The main obstacle with this approach is that it often results in mathematical problems lacking analytical solutions. To cope with this, we make use of statistical simulation algorithms known as Monte Carlo methods to approximate the intractable solution. These methods enjoy well-understood statistical properties but are often computationally prohibitive to employ. The main contribution of this thesis is the exploration of different strategies for accelerating inference methods based on sequential Monte Carlo (SMC) and Markov chain Monte Carlo (MCMC). That is, strategies for reducing the computational effort while keeping or improving the accuracy. A major part of the thesis is devoted to proposing such strategies for the MCMC method known as the particle Metropolis-Hastings (PMH) algorithm. We investigate two strategies: (i) introducing estimates of the gradient and Hessian of the target to better tailor the algorithm to the problem and (ii) introducing a positive correlation between the point-wise estimates of the target. Furthermore, we propose an algorithm based on the combination of SMC and Gaussian process optimisation, which can provide reasonable estimates of the posterior but with a significant decrease in computational effort compared with PMH. 
Moreover, we explore the use of sparseness priors for approximate inference in over-parametrised mixed effects models and autoregressive processes. This can potentially be a practical strategy for inference in the big data era. Finally, we propose a general method for increasing the accuracy of the parameter estimates in non-linear state space models by applying a designed input signal. / Should the Riksbank raise or lower the repo rate at its next meeting to reach the inflation target? Which genes are associated with a particular disease? How can Netflix and Spotify know which films I want to watch and which music I want to listen to next? These three problems are examples of questions where statistical models can be useful for providing support and a basis for decisions. Statistical models combine theoretical knowledge about, for example, the Swedish economic system with historical data to produce forecasts of future events. These forecasts can then be used to evaluate, for example, what would happen to inflation in Sweden if unemployment falls, or how the value of my pension savings changes when the Stockholm stock exchange crashes. Applications such as these and many others make statistical models important for many parts of society. One way of developing statistical models is to update a model continuously as more information is collected. This approach is called Bayesian statistics and is particularly useful when one already has good insight into the model or has access to only a small amount of historical data for building it. A drawback of Bayesian statistics is that the computations required to update the model with the new information are often very complicated. In such situations, one can instead simulate the outcome from millions of variants of the model and then compare these against the historical observations at hand. 
One can then average over the variants that gave the best results to arrive at a final model. It can therefore sometimes take days or weeks to produce a model. The problem becomes particularly severe when using more advanced models that could give better forecasts but take too long to build. In this thesis, we use a number of different strategies to facilitate or improve these simulations. For example, we propose taking more insights about the system into account and thereby reducing the number of model variants that need to be examined. We can thus rule out certain models in advance because we have a good idea of roughly what a good model should look like. We can also change the simulation so that it moves more easily between different types of models. In this way, the space of all possible models is explored more efficiently. We propose a number of different combinations and modifications of existing methods to speed up the fitting of the model to the observations. We show that the computation time can in some cases decrease from a few days to about an hour. Hopefully, this will in the future make it possible in practice to use more advanced models, which in turn will result in better forecasts and decisions. Computational statistics Monte Carlo Markov chains Particle filters Machine learning Bayesian optimisation Approximate Bayesian Computations Gaussian processes Particle Metropolis-Hastings Approximate inference Pseudo-marginal methods
49	Estimation bayésienne nonparamétrique de copules Guillotte, Simon January 2008 (has links) Thesis digitized by the Records Management and Archives Division (Division de la gestion de documents et des archives) of the Université de Montréal. Bayes Copulas Dependence function Gibbs Doubly stochastic matrices MCMC Metropolis-Hastings Nonparametric Birkhoff polytope Prediction Reversible jump
50	Bayesian Models for the Analyzes of Noisy Responses From Small Areas: An Application to Poverty Estimation Manandhar, Binod 26 April 2017 (has links) We implement techniques of small area estimation (SAE) to study consumption, a welfare indicator, which is used to assess poverty in the 2003-2004 Nepal Living Standards Survey (NLSS-II) and the 2001 census. NLSS-II has detailed information on consumption, but it can give estimates only at stratum level or higher. While population variables are available for all households in the census, they do not include information on consumption; the survey nonetheless has the 'population' variables. We combine these two sets of data to provide estimates of poverty indicators (incidence, gap and severity) for small areas (wards, village development committees and districts). Consumption is the aggregate of all food and all non-food items consumed. In the welfare survey, respondents are asked to recall all information about consumption throughout the reference year. Such data are therefore likely to be noisy, possibly due to response errors or recall errors. The consumption variable is continuous and positively skewed, so a statistician might use a logarithmic transformation, which can reduce skewness and help meet the normality assumption required for model building. However, this can be problematic, since back transformation may produce inaccurate estimates and the results are difficult to interpret. Without using the logarithmic transformation, we develop hierarchical Bayesian models to link the survey to the census. In our models for consumption, we incorporate the 'population' variables as covariates. First, we assume that consumption is noiseless, and it is modeled using three scenarios: the exponential distribution, the gamma distribution and the generalized gamma distribution. Second, we assume that consumption is noisy, and we fit the generalized beta distribution of the second kind (GB2) to consumption. 
We consider three more scenarios of GB2: a mixture of exponential and gamma distributions, a mixture of two gamma distributions, and a mixture of two generalized gamma distributions. We note that there are difficulties in fitting the models for noisy responses because these models have non-identifiable parameters. For each scenario, after fitting two hierarchical Bayesian models (with and without area effects), we show how to select the most plausible model and we perform a Bayesian data analysis on Nepal's poverty data. We show how to predict the poverty indicators for all wards, village development committees and districts of Nepal (a big data problem) by combining the survey data with the census. This is a computationally intensive problem because Nepal has about four million households with about four thousand households in the survey and there is no record linkage between households in the survey and the census. Finally, we perform empirical studies to assess the quality of our survey-census procedure. Poverty Non-normality Noninformative priors Noisy responses Logarithmic transformation Hierarchical Bayesian GB2 distribution Metropolis-Hastings algorithm Small area estimation
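The three poverty indicators named in this entry (incidence, gap and severity) are commonly computed as the Foster-Greer-Thorbecke family with exponents 0, 1 and 2. The sketch below assumes that convention, which the abstract does not spell out; the function name `fgt_index`, the poverty line and the sample values are hypothetical.

```python
def fgt_index(consumption, poverty_line, alpha):
    """Foster-Greer-Thorbecke poverty index for one area.

    alpha = 0 gives incidence (headcount ratio), alpha = 1 the poverty gap,
    alpha = 2 severity (squared poverty gap). The mapping to the abstract's
    three indicators is assumed, not taken from the thesis.
    """
    if poverty_line <= 0:
        raise ValueError("poverty_line must be positive")
    if not consumption:
        raise ValueError("consumption must be non-empty")
    total = 0.0
    for c in consumption:
        if c < poverty_line:
            # Normalized shortfall of a poor household, raised to alpha.
            total += ((poverty_line - c) / poverty_line) ** alpha
    return total / len(consumption)


# Hypothetical household consumption values and poverty line.
households = [50.0, 80.0, 120.0, 200.0]
incidence = fgt_index(households, 100.0, 0)
gap = fgt_index(households, 100.0, 1)
severity = fgt_index(households, 100.0, 2)
```

In a Bayesian small area analysis, a function like this would typically be applied to each posterior predictive draw of household consumption within an area, giving a posterior distribution for each indicator rather than a single value.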

Search results