21 |
The effects of three different priors for variance parameters in the normal-mean hierarchical model
Chen, Zhu, 1985- 01 December 2010
Many prior distributions have been suggested for variance parameters in the hierarchical model. The “non-informative” interval of the conjugate inverse-gamma prior might cause problems. I consider three priors for the variance parameters (conjugate inverse-gamma, log-normal and truncated normal) and carry out a numerical analysis on Gelman’s 8-schools data. Using the posterior draws, I then compare the Bayesian credible intervals of the parameters under the three priors. I use predictive distributions to make predictions and then discuss the differences among the three priors.
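As a rough illustration of the three prior families named in this abstract, the sketch below writes each one as a log density on a variance v > 0. The hyperparameter defaults (a, b, m, s) are illustrative assumptions only and are not taken from the thesis.

```python
import math

def log_inv_gamma(v, a=2.0, b=1.0):
    """Log density of an inverse-gamma(a, b) prior on a variance v > 0."""
    return a * math.log(b) - math.lgamma(a) - (a + 1.0) * math.log(v) - b / v

def log_lognormal(v, m=0.0, s=1.0):
    """Log density of a log-normal prior: log(v) ~ Normal(m, s^2)."""
    z = (math.log(v) - m) / s
    return -math.log(v) - math.log(s * math.sqrt(2.0 * math.pi)) - 0.5 * z * z

def log_truncated_normal(v, m=0.0, s=1.0):
    """Log density of Normal(m, s^2) truncated to v > 0."""
    z = (v - m) / s
    # Probability that the untruncated normal is positive: Phi(m / s).
    mass_above_zero = 0.5 * (1.0 + math.erf(m / (s * math.sqrt(2.0))))
    return (-0.5 * z * z
            - math.log(s * math.sqrt(2.0 * math.pi))
            - math.log(mass_above_zero))
```

Any of these functions could be added to a log-likelihood to form an unnormalised log posterior for the variance parameter; which hyperparameters count as "non-informative" is exactly the question the abstract raises.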
|
22 |
Massively Parallel Dimension Independent Adaptive Metropolis
Chen, Yuxin 14 May 2015
This work considers black-box Bayesian inference over high-dimensional parameter spaces. The well-known and widely respected adaptive Metropolis (AM) algorithm is extended herein to asymptotically scale uniformly with respect to the underlying parameter dimension, by respecting the variance, for Gaussian targets. The resulting algorithm, referred to as the dimension-independent adaptive Metropolis (DIAM) algorithm, also shows improved performance with respect to adaptive Metropolis on non-Gaussian targets. This algorithm is further improved, and the possibility of probing high-dimensional targets is enabled, via GPU-accelerated numerical libraries and periodically synchronized concurrent chains (justified a posteriori). Asymptotically in dimension, this massively parallel dimension-independent adaptive Metropolis (MPDIAM) GPU implementation exhibits a factor of four improvement versus the CPU-based Intel MKL version alone, which is itself already a factor of three improvement versus the serial version. The scaling to multiple CPUs and GPUs exhibits a form of strong scaling in terms of the time necessary to reach a certain convergence criterion, through a combination of longer time per sample batch (weak scaling) and yet fewer necessary samples to convergence. This is illustrated by efficiently sampling from several Gaussian and non-Gaussian targets for dimension d 1000.
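For orientation, here is a minimal one-dimensional sketch of the classic adaptive Metropolis idea that this work builds on: the random-walk proposal variance is tuned from the running sample variance of the chain. It is not the DIAM or MPDIAM algorithm; the function name, the `adapt_start` warm-up length and the `eps` regulariser are assumptions made for illustration.

```python
import math
import random

def adaptive_metropolis_1d(log_target, x0, n_steps, adapt_start=100, eps=1e-6, seed=0):
    """Random-walk Metropolis whose proposal scale adapts to the chain history."""
    rng = random.Random(seed)
    x, logp = x0, log_target(x0)
    # Welford running statistics of the chain (count, mean, sum of squared deviations).
    n, mean, m2 = 1, x0, 0.0
    prop_sd = 1.0  # fixed proposal scale used before adaptation starts
    samples = []
    for _ in range(n_steps):
        if n > adapt_start:
            # Standard AM scaling 2.38^2 / d with d = 1, plus a small regulariser.
            prop_sd = math.sqrt(2.38 ** 2 * (m2 / (n - 1)) + eps)
        y = x + rng.gauss(0.0, prop_sd)
        logp_y = log_target(y)
        # Symmetric proposal, so the acceptance ratio is just the density ratio.
        if math.log(1.0 - rng.random()) < logp_y - logp:
            x, logp = y, logp_y
        samples.append(x)
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return samples
```

In d dimensions the scalar variance becomes an empirical covariance matrix; how that adaptation behaves as d grows is the issue the abstract addresses.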
|
23 |
Auxiliary variable Markov chain Monte Carlo methods
Graham, Matthew McKenzie January 2018
Markov chain Monte Carlo (MCMC) methods are a widely applicable class of algorithms for estimating integrals in statistical inference problems. A common approach in MCMC methods is to introduce additional auxiliary variables into the Markov chain state and perform transitions in the joint space of target and auxiliary variables. In this thesis we consider novel methods for using auxiliary variables within MCMC methods to allow approximate inference in otherwise intractable models and to improve sampling performance in models exhibiting challenging properties such as multimodality. We first consider the pseudo-marginal framework. This extends the Metropolis–Hastings algorithm to cases where we only have access to an unbiased estimator of the density of the target distribution. The resulting chains can sometimes show ‘sticking’ behaviour where long series of proposed updates are rejected. Further, the algorithms can be difficult to tune and it is not immediately clear how to generalise the approach to alternative transition operators. We show that if the auxiliary variables used in the density estimator are included in the chain state it is possible to use new transition operators such as those based on slice-sampling algorithms within a pseudo-marginal setting. This auxiliary pseudo-marginal approach leads to easier-to-tune methods and is often able to improve sampling efficiency over existing approaches. As a second contribution we consider inference in probabilistic models defined via a generative process with the probability density of the outputs of this process only implicitly defined. The approximate Bayesian computation (ABC) framework allows inference in such models when conditioning on the values of observed model variables by making the approximation that generated observed variables are ‘close’ rather than exactly equal to observed data.
Although making the inference problem more tractable, the approximation error introduced in ABC methods can be difficult to quantify and standard algorithms tend to perform poorly when conditioning on high-dimensional observations. This often requires further approximation by reducing the observations to lower-dimensional summary statistics. We show how including all of the random variables used in generating model outputs as auxiliary variables in a Markov chain state can allow the use of more efficient and robust MCMC methods such as slice sampling and Hamiltonian Monte Carlo (HMC) within an ABC framework. In some cases this can allow inference when conditioning on the full set of observed values when standard ABC methods require reduction to lower-dimensional summaries for tractability. Further, we introduce a novel constrained HMC method for performing inference in a restricted class of differentiable generative models which allows conditioning the generated observed variables to be arbitrarily close to observed data while maintaining computational tractability. As a final topic we consider the use of an auxiliary temperature variable in MCMC methods to improve exploration of multimodal target densities and allow estimation of normalising constants. Existing approaches such as simulated tempering and annealed importance sampling use temperature variables which take on only a discrete set of values. The performance of these methods can be sensitive to the number and spacing of the temperature values used, and the discrete nature of the temperature variable prevents the use of gradient-based methods such as HMC to update the temperature alongside the target variables. We introduce new MCMC methods which instead use a continuous temperature variable. This both removes the need to tune the choice of discrete temperature values and allows the temperature variable to be updated jointly with the target variables within an HMC method.
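The pseudo-marginal framework mentioned in this abstract can be sketched on a toy problem. The code below is the standard pseudo-marginal Metropolis–Hastings update, not the auxiliary slice-sampling variant proposed in the thesis. The noisy estimator (the exact unnormalised density exp(-x²/2) multiplied by mean-one log-normal noise) and all names and defaults are assumptions chosen for illustration.

```python
import math
import random

def noisy_density(x, rng, noise_sd=0.5):
    """Unbiased, non-negative estimate of the unnormalised density exp(-x^2 / 2).

    Toy assumption: the exact value is multiplied by log-normal noise with mean one.
    """
    w = math.exp(noise_sd * rng.gauss(0.0, 1.0) - 0.5 * noise_sd ** 2)
    return math.exp(-0.5 * x * x) * w

def pseudo_marginal_mh(n_steps, step=1.5, seed=0):
    """Metropolis-Hastings using density estimates in place of exact densities."""
    rng = random.Random(seed)
    x = 0.0
    p_hat = noisy_density(x, rng)  # estimate attached to the current state
    samples = []
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step)
        q_hat = noisy_density(y, rng)  # fresh estimate for the proposal only
        # Accept with probability min(1, q_hat / p_hat). The current estimate is
        # reused, never recomputed; this keeps the exact target invariant.
        if rng.random() * p_hat < q_hat:
            x, p_hat = y, q_hat
        samples.append(x)
    return samples
```

The 'sticking' behaviour described in the abstract arises when `p_hat` happens to be a large overestimate: proposals are then rejected until a comparably large estimate is drawn.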
|
24 |
Programming language semantics as a foundation for Bayesian inference
Szymczak, Marcin January 2018
Bayesian modelling, in which our prior belief about the distribution on model parameters is updated by observed data, is a popular approach to statistical data analysis. However, writing specific inference algorithms for Bayesian models by hand is time-consuming and requires significant machine learning expertise. Probabilistic programming promises to make Bayesian modelling easier and more accessible by letting the user express a generative model as a short computer program (with random variables), leaving inference to the generic algorithm provided by the compiler of the given language. However, it is not easy to design a probabilistic programming language correctly and define the meaning of programs expressible in it. Moreover, the inference algorithms used by probabilistic programming systems usually lack formal correctness proofs and bugs have been found in some of them, which limits the confidence one can have in the results they return. In this work, we apply ideas from the areas of programming language theory and statistics to show that probabilistic programming can be a reliable tool for Bayesian inference. The first part of this dissertation concerns the design, semantics and type system of a new, substantially enhanced version of the Tabular language. Tabular is a schema-based probabilistic language, which means that instead of writing a full program, the user only has to annotate the columns of a schema with expressions generating corresponding values. By adopting this paradigm, Tabular aims to be user-friendly, but this unusual design also makes it harder to define the syntax and semantics correctly and reason about the language. We define the syntax of a version of Tabular extended with user-defined functions and pseudo-deterministic queries, design a dependent type system for this language and endow it with a precise semantics. 
We also extend Tabular with a concise formula notation for hierarchical linear regressions, define the type system of this extended language and show how to reduce it to pure Tabular. In the second part of this dissertation, we present the first correctness proof for a Metropolis-Hastings sampling algorithm for a higher-order probabilistic language. We define a measure-theoretic semantics of the language by means of an operationally-defined density function on program traces (sequences of random variables) and a map from traces to program outputs. We then show that the distribution of samples returned by our algorithm (a variant of “Trace MCMC” used by the Church language) matches the program semantics in the limit.
|
25 |
Algoritmos para o encaixe de moldes com formato irregular em tecidos listrados [Algorithms for packing irregularly shaped pattern pieces on striped fabrics]
Alves, Andressa Schneider January 2016
The main goal of this thesis is to propose a solution to the problem of packing pattern pieces on striped fabrics in the clothing industry. Pattern pieces are irregularly shaped parts that must be laid out on the raw material, in this case fabric, before the cutting stage. In the specific problem of packing on striped fabrics, the pieces must be positioned so that the stripes are continuous once the garment is sewn. The theoretical foundation therefore covers topics in fashion and clothing design, such as the types and patterns of striped fabrics and the possibilities for rotating and placing pattern pieces on them. It also covers topics from combinatorial optimization research: the characteristics of two-dimensional cutting and packing problems and the algorithms that various authors have used to solve them. The final part of the theoretical foundation describes the Markov chain Monte Carlo method and the Metropolis-Hastings algorithm. Based on the literature review, two distinct algorithms are proposed for packing pattern pieces on striped fabrics: an algorithm with a pre-processing step, and a best-packing search algorithm that uses the Metropolis-Hastings algorithm. Both were implemented in the Riscare Listrado software, a continuation of the Riscare software for plain fabrics developed in Alves (2010). The performance of the two algorithms was tested on six benchmark problems from the literature and on a new problem, named “camisa masculina” (men's shirt), proposed here. The literature benchmarks were originally proposed for plain raw material, whereas the men's shirt problem was designed specifically for striped fabrics. Of the two algorithms, the best-packing search algorithm achieved better fabric-utilization efficiencies on all the problems tested. Compared with the best results published in the literature for plain raw material, it produced packings with lower efficiencies, but its results exceeded the efficiencies recommended by the fashion literature for patterned fabrics.
|
27 |
Klasifikace bakterií do taxonomických kategorií na základě vlastností 16s rRNA / Bacteria Classification into Taxonomic Categories Based on Properties of 16s rRNA
Grešová, Katarína January 2020
The main goal of this thesis was to design and implement a tool that would be able to classify the sequences of the 16S rRNA gene into taxonomic categories using the properties of the 16S rRNA gene. The created tool analyzes all input sequences simultaneously, which differs from common classification approaches, which classify input sequences individually. This tool relies on the fact that bacteria contain several copies of the 16S rRNA gene, which may differ in sequence. The main contribution of this work is design, implementation and evaluation of the capabilities of this tool. Experiments have shown that the proposed tool is able to identify the corresponding bacteria for smaller datasets and determine the correct ratios of their abundances. However, with larger datasets, the state space becomes very large and fragmented, which requires further improvements in order for it to search the state space in an efficient way.
|
28 |
Importance Sampling of Rare Events in Chaotic Systems
Leitão, Jorge C. 19 August 2016
Rare events play a crucial role in our society and a great effort has been dedicated to numerically studying them in different contexts. This thesis proposes a numerical methodology based on the Metropolis-Hastings Monte Carlo algorithm to efficiently sample rare events in chaotic systems. It starts by reviewing the relevance of rare events in chaotic systems, focusing on two types of rare events: states in closed systems with rare chaoticities, characterised by a finite-time Lyapunov exponent on a tail of its distribution, and states in transiently chaotic systems, characterised by an escape time on the tail of its distribution.
This thesis argues that these two problems can be interpreted as a traditional problem of statistical physics: sampling exponentially rare states in the phase space (states in the tail of the density of states) as a parameter (the system size) increases. This is used as the starting point to review the Metropolis-Hastings algorithm, a traditional and flexible methodology of importance sampling in statistical physics. By an analytical argument, it is shown that the chaoticity of the system hinders direct application of Metropolis-Hastings techniques to efficiently sample these states because the acceptance rate is low. It is argued that a crucial step to overcome the low acceptance rate is to construct a proposal distribution that uses information about the system to bound the acceptance rate. Using generic properties of chaotic systems, such as exponential divergence of initial conditions and fractals embedded in their phase spaces, a proposal distribution that guarantees a bounded acceptance rate is derived for each type of rare event. This proposal is numerically tested in simple chaotic systems, and the efficiency of the resulting algorithm is measured in numerous examples for both types of rare events.
The results confirm the dramatic improvement of using Monte Carlo importance sampling with the derived proposals over traditional methodologies: the number of samples required to sample an exponentially rare state increases polynomially, as opposed to the exponential increase observed in uniform sampling. This thesis then analyses the sub-optimal (polynomial) efficiency of this algorithm in a simple system and shows analytically how the correlations induced by the proposal distribution can be detrimental to the efficiency of the algorithm. This thesis also analyses the effect of high-dimensional chaos on the proposal distribution and concludes that an anisotropic proposal that takes advantage of the different rates of expansion along the different unstable directions is able to efficiently find rare states.
The applicability of this methodology to sampling rare states in non-hyperbolic systems is also discussed, with a focus on three systems: the logistic map, the Pomeau-Manneville map, and the standard map. Here, it is argued that the different origins of non-hyperbolicity require different proposal distributions. Overall, the results show that by incorporating specific information about the system in the proposal distribution of the Metropolis-Hastings algorithm, it is possible to efficiently find and sample rare events of chaotic systems. This improved methodology should be useful to a large class of problems where the numerical characterisation of rare events is important.
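The observable behind the first type of rare event, the finite-time Lyapunov exponent, can be made concrete for the logistic map named in this abstract. The sketch below computes it for the map x → r·x·(1 − x) as the time average of log|f'(x)| along a trajectory. It only evaluates the observable and does not implement the proposal distributions derived in the thesis; the function name and the guard against log(0) are assumptions.

```python
import math

def finite_time_lyapunov(x0, t, r=4.0):
    """Finite-time Lyapunov exponent of the logistic map over t iterations."""
    x = x0
    total = 0.0
    for _ in range(t):
        deriv = abs(r * (1.0 - 2.0 * x))          # |f'(x)| for f(x) = r x (1 - x)
        total += math.log(max(deriv, 1e-300))    # guard against log(0) at x = 0.5
        x = r * x * (1.0 - x)
    return total / t
```

For example, `finite_time_lyapunov(0.75, 50)` evaluates the exponent at the fixed point x = 0.75 of the r = 4 map, where |f'| = 2 at every step. Initial conditions whose exponent lies far in the tail of the distribution over x0 are the rare states the abstract refers to.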
|
29 |
An Adaptive Bayesian Approach to Bernoulli-Response Clinical Trials
Stacey, Andrew W. 06 August 2007
Traditional clinical trials have been inefficient in their methods of dose finding and dose allocation. In this paper a four-parameter logistic equation is used to model the outcome of Bernoulli-response clinical trials. A Bayesian adaptive design is used to fit the logistic equation to the dose-response curve of Phase II and Phase III clinical trials. Because of inherent restrictions in the logistic model, symmetric candidate densities cannot be used, thereby creating asymmetric jumping rules inside the Markov chain Monte Carlo algorithm. An order restricted Metropolis-Hastings algorithm is implemented to account for these limitations. Modeling clinical trials in a Bayesian framework allows the experiment to be adaptive. In this adaptive design batches of subjects are assigned to doses based on the posterior probability of success for each dose, thereby increasing the probability of receiving advantageous doses. Good posterior fitting is demonstrated for typical dose-response curves and the Bayesian design is shown to properly stop drug trials for clinical futility or clinical success. In this paper we demonstrate that an adaptive Bayesian approach to dose-response studies increases both the statistical and medicinal effectiveness of clinical research.
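The abstract notes that asymmetric jumping rules are needed inside the Markov chain Monte Carlo algorithm. A minimal sketch of how a Metropolis-Hastings step accounts for an asymmetric proposal is given below, using a multiplicative (log-normal) random walk on a positive parameter. This is a generic illustration, not the order-restricted algorithm of the paper; the function name and the `scale` default are assumptions.

```python
import math
import random

def mh_multiplicative(log_target, x0, n_steps, scale=1.0, seed=0):
    """Metropolis-Hastings on x > 0 with an asymmetric log-normal proposal."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_steps):
        # Propose y = x * exp(scale * z); this keeps y positive but is not symmetric.
        y = x * math.exp(scale * rng.gauss(0.0, 1.0))
        # Hastings correction: q(x | y) / q(y | x) = y / x for this proposal.
        log_ratio = (log_target(y) - log_target(x)) + (math.log(y) - math.log(x))
        if math.log(1.0 - rng.random()) < log_ratio:
            x = y
        samples.append(x)
    return samples
```

Dropping the `log(y) - log(x)` term would make the chain converge to the wrong distribution, which is why asymmetric candidate densities need the full Hastings ratio rather than the plain Metropolis ratio.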
|
30 |
A Bayesian Approach to Missile Reliability
Redd, Taylor Hardison 01 June 2011
Each year, billions of dollars are spent on missiles and munitions by the United States government. It is therefore vital to have a dependable method to estimate the reliability of these missiles. It is important to take into account the age of the missile, the reliability of different components of the missile, and the impact of different launch phases on missile reliability. Additionally, it is of importance to estimate the missile performance under a variety of test conditions, or modalities. Bayesian logistic regression is utilized to accurately make these estimates. This project presents both previously proposed methods and ways to combine these methods to accurately estimate the reliability of the Cruise Missile.
|