71 |
Bayesian Adjustment for Multiplicity
Scott, James Gordon January 2009
<p>This thesis is about Bayesian approaches for handling multiplicity. It considers three main kinds of multiple-testing scenarios: tests of exchangeable experimental units, tests for variable inclusion in linear regression models, and tests for conditional independence in jointly normal vectors. Multiplicity adjustment in these three areas will be seen to have many common structural features. Though the modeling approach throughout is Bayesian, frequentist reasoning regarding error rates will often be employed.</p><p>Chapter 1 frames the issues in the context of historical debates about Bayesian multiplicity adjustment. Chapter 2 confronts the problem of large-scale screening of functional data, where control over Type-I error rates is a crucial issue. Chapter 3 develops new theory for comparing Bayes and empirical-Bayes approaches for multiplicity correction in regression variable selection. Chapters 4 and 5 describe new theoretical and computational tools for Gaussian graphical-model selection, where multiplicity arises in performing many simultaneous tests of pairwise conditional independence. Chapter 6 introduces a new approach to sparse-signal modeling based upon local shrinkage rules. Here the focus is not on multiplicity per se, but rather on using ideas from Bayesian multiple-testing models to motivate a new class of multivariate scale-mixture priors. Finally, Chapter 7 describes some directions for future study, many of which are the subjects of my current research agenda.</p> / Dissertation
|
72 |
Bayesian Sparse Learning for High Dimensional Data
Shi, Minghui January 2011
<p>In this thesis, we develop Bayesian sparse learning methods for high dimensional data analysis. Two important topics are related to the idea of sparse learning: variable selection and factor analysis. We start with the Bayesian variable selection problem in regression models. One challenge in Bayesian variable selection is to search the huge model space adequately while identifying high posterior probability regions. In the past decades, the main focus has been on the use of Markov chain Monte Carlo (MCMC) algorithms for these purposes. In the first part of this thesis, instead of using MCMC, we propose a new computational approach based on sequential Monte Carlo (SMC), which we refer to as particle stochastic search (PSS). We illustrate PSS through applications to linear regression and probit models.</p><p>Besides the Bayesian stochastic search algorithms, there is a rich literature on shrinkage and variable selection methods for high dimensional regression and classification with vector-valued parameters, such as the lasso (Tibshirani, 1996) and the relevance vector machine (Tipping, 2001). Compared with the Bayesian stochastic search algorithms, these methods do not account for model uncertainty but are more computationally efficient. In the second part of this thesis, we generalize this type of idea to matrix-valued parameters and focus on developing an efficient variable selection method for multivariate regression. We propose a Bayesian shrinkage model (BSM) and an efficient algorithm for learning the associated parameters.</p><p>In the third part of this thesis, we focus on factor analysis, which has been widely used in unsupervised learning. One central problem in factor analysis is the determination of the number of latent factors. We propose some Bayesian model selection criteria for selecting the number of latent factors based on a graphical factor model. As illustrated in Chapter 4, our proposed method achieves good performance in correctly selecting the number of factors in several different settings. As for applications, we implement the graphical factor model for several different purposes, such as covariance matrix estimation, latent factor regression and classification.</p> / Dissertation
|
73 |
Bayesian variable selection in clustering via Dirichlet process mixture models
Kim, Sinae 17 September 2007
The increased collection of high-dimensional data in various fields has raised a strong
interest in clustering algorithms and variable selection procedures. In this dissertation,
I propose a model-based method that addresses the two problems simultaneously.
I use Dirichlet process mixture models to define the cluster structure and
introduce into the model a latent binary vector to identify discriminating variables. I
update the variable selection index using a Metropolis algorithm and obtain inference
on the cluster structure via a split-merge Markov chain Monte Carlo technique. I
evaluate the method on simulated data and illustrate an application with a DNA
microarray study. I also show that the methodology can be adapted to the problem
of clustering functional high-dimensional data. There I employ wavelet thresholding
methods in order to reduce the dimension of the data and to remove noise from the
observed curves. I then apply variable selection and sample clustering methods in the
wavelet domain. Thus my methodology is wavelet-based and aims at clustering the
curves while identifying wavelet coefficients describing discriminating local features.
I exemplify the method on high-dimensional and high-frequency tidal volume traces
measured under an induced panic attack model in normal humans.
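
As a rough illustration of how a Dirichlet process prior induces a cluster structure, the sketch below draws a random partition from the Chinese restaurant process, which is the partition distribution implied by a Dirichlet process. It is not the split-merge sampler or the variable selection step described in the abstract; the function name `sample_crp_partition` and the concentration parameter `alpha` are assumptions made for this example.

```python
import random
from typing import List


def sample_crp_partition(n: int, alpha: float, rng: random.Random) -> List[int]:
    """Draw cluster labels for n items from a Chinese restaurant process.

    alpha > 0 is the (hypothetical) concentration parameter: larger values
    favour opening new clusters.
    """
    labels: List[int] = []
    counts: List[int] = []  # counts[k] = number of items currently in cluster k
    for _ in range(n):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)   # open a new cluster
        else:
            counts[k] += 1     # join an existing cluster
        labels.append(k)
    return labels


labels = sample_crp_partition(50, alpha=1.0, rng=random.Random(0))
print(len(set(labels)), "clusters among", len(labels), "items")
```

The number of clusters is not fixed in advance; it grows slowly with the number of items, which is what makes this kind of prior convenient for clustering.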
|
74 |
Statistical validation and calibration of computer models
Liu, Xuyuan 21 January 2011
This thesis deals with modeling, validation and calibration problems in experiments with computer models. Computer models are mathematical representations of real systems developed for understanding and investigating those systems. Before a computer model
is used, it often needs to be validated by comparing the computer outputs with physical observations and calibrated by adjusting internal model parameters in order to improve the agreement between the computer outputs and physical observations.
As computer models become more powerful and popular, the complexity of input and output data raises new computational challenges and stimulates the development of novel statistical modeling methods.
One challenge is to deal with computer models with random inputs (random effects). This kind of computer model is very common in engineering applications. For example, in a thermal experiment in the Sandia National Lab (Dowding et al. 2008), the volumetric heat capacity and thermal conductivity are random input variables. If input variables are randomly sampled from particular distributions with unknown parameters, the existing methods in the literature are not directly applicable. The reason is that integration over the random variable distribution is needed for the joint likelihood, and the integral cannot always be expressed in closed form. In this research, we propose a new approach which combines the nonlinear mixed effects model and the Gaussian process model (Kriging model). Different model formulations are also studied, using the thermal problem, to gain a better understanding of validation and calibration activities.
Another challenge comes from computer models with functional outputs. While many methods have been developed for modeling computer experiments with a single response, the literature on modeling computer experiments with functional responses is sparse. Dimension reduction techniques can be used to overcome the complexity of a functional response; however, they generally involve two steps. Models are first fit at each individual setting of the input to reduce the dimensionality of the functional data. Then the estimated parameters of the models are treated as new responses, which are further modeled for prediction. Alternatively, pointwise models are first constructed at each time point and then functional curves are fit to the parameter estimates obtained from the fitted models. In this research, we first propose a functional regression model to relate functional responses to both design and time variables in one single step. Secondly, we propose a functional kriging model which performs variable selection by imposing a penalty function. We show that the proposed model performs better than dimension reduction based approaches and the kriging model without regularization. In addition, non-asymptotic theoretical bounds on the estimation error are presented.
|
75 |
Bayesian parsimonious covariance estimation for hierarchical linear mixed models
Frühwirth-Schnatter, Sylvia, Tüchler, Regina January 2004
We consider a non-centered parameterization of the standard random-effects model, which is based on the Cholesky decomposition of the variance-covariance matrix. The regression-type structure of the non-centered parameterization allows us to choose a simple, conditionally conjugate normal prior on the Cholesky factor. Based on the non-centered parameterization, we search for a parsimonious variance-covariance matrix by identifying the non-zero elements of the Cholesky factor using Bayesian variable selection methods. With this method we are able to learn from the data, for each effect, whether it is random or not, and whether covariances among random effects are zero or not. An application in marketing shows a substantial reduction of the number of free elements of the variance-covariance matrix. (author's abstract) / Series: Research Report Series / Department of Statistics and Mathematics
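
The link between zeros in the Cholesky factor and the structure of the variance-covariance matrix can be sketched with a small numerical example. The code below assumes a lower-triangular factor C with random effects written as C times a standard normal vector, so that their covariance is Q = C Cᵀ; the matrix values and the helper names `covariance_from_cholesky` and `random_effect_flags` are hypothetical and are not taken from the report.

```python
from typing import List

Matrix = List[List[float]]


def covariance_from_cholesky(c: Matrix) -> Matrix:
    """Return Q = C C^T for a square lower-triangular factor C."""
    q = len(c)
    return [
        [sum(c[i][k] * c[j][k] for k in range(q)) for j in range(q)]
        for i in range(q)
    ]


def random_effect_flags(c: Matrix) -> List[bool]:
    """Effect i has positive variance exactly when row i of C is not all zero."""
    return [any(x != 0.0 for x in row) for row in c]


# Hypothetical factor: the third row is zero, so the third effect is fixed.
C = [
    [1.0, 0.0, 0.0],
    [0.5, 2.0, 0.0],
    [0.0, 0.0, 0.0],
]
Q = covariance_from_cholesky(C)
print(Q)                       # third row and column of Q are zero
print(random_effect_flags(C))  # [True, True, False]
```

Setting an entire row of C to zero removes the corresponding random effect, while zeroing individual off-diagonal entries removes covariances, which is why variable selection on the elements of C yields a parsimonious Q.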
|
76 |
Bayesian Hierarchical Models for Model Choice
Li, Yingbo January 2013
<p>With the development of modern data collection approaches, researchers may collect hundreds to millions of variables, yet may not need to utilize all explanatory variables available in predictive models. Hence, choosing models that consist of a subset of variables often becomes a crucial step. In linear regression, variable selection not only reduces model complexity, but also prevents over-fitting. From a Bayesian perspective, prior specification of model parameters plays an important role in model selection as well as parameter estimation, and often prevents over-fitting through shrinkage and model averaging.</p><p>We develop two novel hierarchical priors for selection and model averaging, for Generalized Linear Models (GLMs) and normal linear regression, respectively. They can be considered as "spike-and-slab" prior distributions or, more appropriately, "spike-and-bell" distributions. Under these priors we achieve dimension reduction, since their point masses at zero allow predictors to be excluded with positive posterior probability. In addition, these hierarchical priors have heavy tails to provide robustness when MLEs are far from zero.</p><p>Zellner's g-prior is widely used in linear models. It preserves the correlation structure among predictors in its prior covariance, and yields closed-form marginal likelihoods, which lead to huge computational savings by avoiding sampling in the parameter space. Mixtures of g-priors avoid fixing g in advance, and can resolve consistency problems that arise with fixed g. For GLMs, we show that the mixture of g-priors using a Compound Confluent Hypergeometric distribution unifies existing choices in the literature and maintains their good properties, such as tractable (approximate) marginal likelihoods and asymptotic consistency for model selection and parameter estimation under specific values of the hyperparameters.</p><p>While the g-prior is invariant under rotation within a model, a potential problem with the g-prior is that it inherits the instability of ordinary least squares (OLS) estimates when predictors are highly correlated. We build a hierarchical prior based on scale mixtures of independent normals, which incorporates invariance under rotations within models like ridge regression and the g-prior, but has heavy tails like the Zellner-Siow Cauchy prior. We find that this method outperforms the gold standard mixture of g-priors and other methods in the case of highly correlated predictors in Gaussian linear models. We incorporate a non-parametric structure, the Dirichlet Process (DP), as a hyperprior to allow more flexibility and adaptivity to the data.</p> / Dissertation
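
To make the "closed-form marginal likelihoods" remark concrete, here is a minimal sketch of the standard fixed-g Bayes factor of a Gaussian linear model against the intercept-only model under Zellner's g-prior, BF = (1+g)^((n-1-p)/2) / (1 + g(1-R²))^((n-1)/2). It covers only the fixed-g normal linear case, not the mixtures of g-priors or the GLM priors developed in the thesis; the function name and the numbers in the usage lines are assumptions for illustration.

```python
import math


def g_prior_log_bayes_factor(r2: float, n: int, p: int, g: float) -> float:
    """Log Bayes factor of a linear model against the intercept-only model.

    r2: coefficient of determination of the candidate model
    n:  number of observations
    p:  number of predictors in the candidate model (intercept excluded)
    g:  fixed value of the g hyperparameter (g > 0)
    """
    # log BF = (n-1-p)/2 * log(1+g) - (n-1)/2 * log(1 + g*(1-r2))
    return 0.5 * (n - 1 - p) * math.log1p(g) - 0.5 * (n - 1) * math.log1p(g * (1.0 - r2))


# Hypothetical comparison: the same fit quality with fewer predictors is favoured.
print(g_prior_log_bayes_factor(r2=0.6, n=50, p=2, g=50.0))
print(g_prior_log_bayes_factor(r2=0.6, n=50, p=5, g=50.0))
```

Because the expression depends on the data only through R², n and p, every model in a search can be scored without sampling regression coefficients, which is the computational saving the abstract refers to.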
|
77 |
Bayesian Variable Selection in Spatial Autoregressive Models
Crespo Cuaresma, Jesus, Piribauer, Philipp 07 1900
This paper compares the performance of Bayesian variable selection approaches for spatial autoregressive models. We present two alternative approaches which can be implemented using Gibbs sampling methods in a straightforward way and allow us to deal with the problem of model uncertainty in spatial autoregressive models in a flexible and computationally efficient way. In a simulation study we show that the variable selection approaches tend to outperform existing Bayesian model averaging techniques both in terms of in-sample predictive performance and computational efficiency.
(authors' abstract) / Series: Department of Economics Working Paper Series
|
78 |
Exact Markov chain Monte Carlo and Bayesian linear regression
Bentley, Jason Phillip January 2009
In this work we investigate the use of perfect sampling methods within the context of Bayesian linear regression. We focus on inference problems related to the marginal posterior model probabilities. Model averaged inference for the response and Bayesian variable selection are considered. Perfect sampling is an alternate form of Markov chain Monte Carlo that generates exact sample points from the posterior of interest. This approach removes the need for burn-in assessment faced by traditional MCMC methods. For model averaged inference, we find the monotone Gibbs coupling from the past (CFTP) algorithm is the preferred choice. This requires the predictor matrix be orthogonal, preventing variable selection, but allowing model averaging for prediction of the response. Exploring choices of priors for the parameters in the Bayesian linear model, we investigate sufficiency for monotonicity assuming Gaussian errors. We discover that a number of other sufficient conditions exist, besides an orthogonal predictor matrix, for the construction of a monotone Gibbs Markov chain. Requiring an orthogonal predictor matrix, we investigate new methods of orthogonalizing the original predictor matrix. We find that a new method using the modified Gram-Schmidt orthogonalization procedure performs comparably with existing transformation methods, such as generalized principal components. Accounting for the effect of using an orthogonal predictor matrix, we discover that inference using model averaging for in-sample prediction of the response is comparable between the original and orthogonal predictor matrix. The Gibbs sampler is then investigated for sampling when using the original predictor matrix and the orthogonal predictor matrix. We find that a hybrid method, using a standard Gibbs sampler on the orthogonal space in conjunction with the monotone CFTP Gibbs sampler, provides the fastest computation and convergence to the posterior distribution. 
We conclude the hybrid approach should be used when the monotone Gibbs CFTP sampler becomes impractical, due to large backwards coupling times. We demonstrate large backwards coupling times occur when the sample size is close to the number of predictors, or when hyper-parameter choices increase model competition. The monotone Gibbs CFTP sampler should be taken advantage of when the backwards coupling time is small. For the problem of variable selection we turn to the exact version of the independent Metropolis-Hastings (IMH) algorithm. We reiterate the notion that the exact IMH sampler is redundant, being a needlessly complicated rejection sampler. We then determine a rejection sampler is feasible for variable selection when the sample size is close to the number of predictors and using Zellner’s prior with a small value for the hyper-parameter c. Finally, we use the example of simulating from the posterior of c conditional on a model to demonstrate how the use of an exact IMH view-point clarifies how the rejection sampler can be adapted to improve efficiency.
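
The rejection-sampling idea mentioned for variable selection can be sketched on a tiny model space. The example below draws exact samples from a discrete target known only up to a normalizing constant, using a uniform proposal over the models; the model labels, the weights and the function name `rejection_sample` are hypothetical and stand in for unnormalized posterior model probabilities, not for the sampler constructed in the thesis.

```python
import random
from typing import Callable, List, Sequence, Tuple

Model = Tuple[int, ...]  # inclusion indicators, e.g. (1, 0) = first predictor only


def rejection_sample(
    target_weight: Callable[[Model], float],
    support: Sequence[Model],
    n_draws: int,
    rng: random.Random,
) -> List[Model]:
    """Exact draws from a discrete target using a uniform proposal.

    A proposed model x is accepted with probability weight(x) / max weight,
    so accepted draws follow the normalized target exactly.
    """
    bound = max(target_weight(x) for x in support)
    draws: List[Model] = []
    while len(draws) < n_draws:
        x = rng.choice(support)
        if rng.random() * bound < target_weight(x):
            draws.append(x)
    return draws


# Hypothetical unnormalized posterior weights over three models.
weights = {(0, 0): 1.0, (1, 0): 2.0, (0, 1): 1.0}
draws = rejection_sample(weights.__getitem__, list(weights), 20000, random.Random(1))
print(draws.count((1, 0)) / len(draws))  # close to 2 / (1 + 2 + 1)
```

No burn-in is needed because every accepted draw is an exact sample; the cost is the rejection rate, which grows when the bound is much larger than typical weights.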
|
79 |
Réduction de dimension via Sliced Inverse Regression : Idées et nouvelles propositions / Dimension reduction via Sliced Inverse Regression: ideas and extensions
Chiancone, Alessandro 28 October 2016
This thesis proposes three extensions of Sliced Inverse Regression (SIR): Collaborative SIR, Student SIR and Knockoff SIR. One of the weak points of SIR is that it is impossible to check whether the Linearity Design Condition (LDC) holds. It is known that the condition holds if X follows an elliptical distribution; for a mixture of elliptical distributions there is no guarantee that the condition is satisfied globally, but it holds locally. Starting from this consideration, an extension is proposed. Given the predictor variable X, Collaborative SIR first performs a clustering. In each cluster, SIR is applied independently. The results from the individual components are combined to give the final solution. Our second contribution, Student SIR, comes from the need to robustify SIR. Since SIR is based on the estimation of the covariance and contains a PCA step, it is sensitive to noise. To extend SIR, an approach based on an inverse formulation of SIR proposed by R.D. Cook is used. Finally, Knockoff SIR is an extension of SIR that performs variable selection and gives sparse solutions; it has its foundations in a recently published paper by R. F. Barber and E. J. Candès that focuses on the false discovery rate in the regression framework. The underlying idea is to construct copies of the original variables that have certain properties. It is shown that SIR is robust to these copies, and a strategy is proposed to use this result for variable selection and to generate sparse solutions.
|
80 |
A importância do ponto de operação nas técnicas de self-optimizing control / The importance of the operating point in self-optimizing control techniques
Schultz, Eduardo dos Santos January 2015
Process optimization has become a fundamental tool for increasing the profitability of chemical plants. Several optimization methods have been proposed over the years. Real-time optimization (RTO) is the most established solution in industry, while self-optimizing control (SOC) appears as a simplified alternative with a lower implementation cost. In this work several aspects of the SOC methodology are studied, starting with an analysis of the impact of the operating point on the development of self-optimizing control structures. Modifications to the formulation of the SOC optimization problem are proposed so that the controlled variables are determined in the same optimization problem in which the operating point is chosen, thus reducing the process loss. In order to analyze the influence of dynamics on the results, a comparative study is carried out of the loss generated in the process throughout operation for optimization structures based on RTO and on SOC. Based on the results obtained for a didactic unit, it is shown that the dynamic behavior of the disturbance has a great influence on the choice of optimization technique, challenging the idea that RTO is an upper limit of SOC. The industrial application of classical SOC techniques is validated on a propylene separation unit, based on a real unit in operation. The process was modelled in a commercial simulator, and from this model a set of controlled variables was generated that achieves an acceptable loss for the unit, showing that the methodology can be applied in real units.
|