Global ETD Search

1	Tests of Independence in a Single 2x2 Contingency Table with Random Margins Yu, Yuan 01 May 2014 In the analysis of contingency tables, Fisher's exact test is an important statistical significance test that is commonly used to test independence between two variables. However, Fisher's exact test assumes fixed margins. That is, it uses information beyond the table, which makes it conservative. To solve this problem, we allow the margins to be random. Instead of fitting the count data to the hypergeometric distribution as in Fisher's exact test, we model the margins and one cell using a multinomial distribution, and then use the likelihood ratio to test the hypothesis of independence. Furthermore, using Bayesian inference, we consider the Bayes factor as another test statistic. To judge test performance, we compare the power of the likelihood ratio test, the Bayes factor test and Fisher's exact test. In addition, we use our methodology to analyse data gathered from the Worcester Heart Attack Study to assess gender differences in the therapeutic management of patients with acute myocardial infarction (AMI) by selected demographic and clinical characteristics. likelihood ratio test Bayes factor
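The two test statistics named in entry 1 can be sketched for a 2x2 table as follows, using only the Python standard library. This is a minimal illustration under stated assumptions, not the thesis's implementation: the four cells are treated as one multinomial sample, the uniform Dirichlet and Beta priors behind the Bayes factor are an assumed choice, and all function names are hypothetical.

```python
from math import erfc, lgamma, log, sqrt


def g2_statistic(table):
    """Likelihood ratio statistic G^2 for independence in a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    g2 = 0.0
    for i, row in enumerate(table):
        for j, n_ij in enumerate(row):
            if n_ij > 0:  # empty cells contribute nothing to the sum
                expected = rows[i] * cols[j] / n
                g2 += 2.0 * n_ij * log(n_ij / expected)
    return g2


def lrt_p_value(table):
    """Asymptotic p-value: G^2 is referred to a chi-squared law with 1 df."""
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    return erfc(sqrt(max(g2_statistic(table), 0.0) / 2.0))


def log_uniform_marginal(counts):
    """Log marginal likelihood kernel under a uniform Dirichlet prior.

    The multinomial coefficient is omitted because it cancels in the ratio.
    """
    k = len(counts)
    return lgamma(k) + sum(lgamma(x + 1) for x in counts) - lgamma(sum(counts) + k)


def log_bayes_factor_10(table):
    """Log Bayes factor of association (H1) against independence (H0).

    Assumed priors: uniform Dirichlet on the four cell probabilities under H1,
    independent uniform priors on the row and column probabilities under H0.
    """
    (a, b), (c, d) = table
    log_m1 = log_uniform_marginal([a, b, c, d])
    log_m0 = log_uniform_marginal([a + b, c + d]) + log_uniform_marginal([a + c, b + d])
    return log_m1 - log_m0


if __name__ == "__main__":
    example = [[12, 5], [4, 14]]  # made-up counts for illustration only
    print(g2_statistic(example), lrt_p_value(example), log_bayes_factor_10(example))
```

A positive value of `log_bayes_factor_10` favours association and a negative value favours independence; the size of the effect depends on the assumed priors.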
2	A Bayesian Test of Independence for Two-way Contingency Tables Under Cluster Sampling Bhatta, Dilli 19 April 2013 We consider a Bayesian approach to the study of independence in a two-way contingency table obtained from a two-stage cluster sampling design. We study the association between two categorical variables when (a) there are no covariates and (b) there are covariates at both unit and cluster levels. Our main idea for the Bayesian test of independence is to convert the cluster sample into an equivalent simple random sample which provides a surrogate of the original sample. Then, this surrogate sample is used to compute the Bayes factor to make an inference about independence. For the test of independence without covariates, the Rao-Scott corrections to the standard chi-squared (or likelihood ratio) statistic were developed. They are "large sample" methods and provide appropriate inference when there are large cell counts. However, they are less successful when there are small cell counts. We have developed a methodology to overcome the limitations of the Rao-Scott corrections. We have used a hierarchical Bayesian model to convert the observed cluster samples to simple random samples. This provides the surrogate samples which can be used to derive the distribution of the Bayes factor to make an inference about independence. We have used a sampling-based method to fit the model. For the test of independence with covariates, we first convert the cluster sample with covariates to a cluster sample without covariates. We use a multinomial logistic regression model with random effects to accommodate the cluster effects. Our idea is to fit the cluster samples to the random effect models and predict the new samples by adjusting for the covariates. This provides the cluster sample without covariates. 
We then use a hierarchical Bayesian model to convert this cluster sample to a simple random sample which allows us to calculate the Bayes factor to make an inference about independence. We use Markov chain Monte Carlo methods to fit our models. We apply our first method to the Third International Mathematics and Science Study (1995) for third grade U.S. students in which we study the association between the mathematics test scores and the communities the students come from, and science test scores and the communities the students come from. We also provide a simulation study which establishes our methodology as a viable alternative to the Rao-Scott approximations for relatively small two-stage cluster samples. We apply our second method to the data from the Trends in International Mathematics and Science Study (2007) for fourth grade U.S. students to assess the association between the mathematics and science scores represented as categorical variables and also provide a simulation study. The results show that if there is a strong association between two categorical variables, there is no difference between the significance of the test using the model (a) with covariates and (b) without covariates. However, in simulation studies, there is a noticeable difference in the significance of the test between the two models in borderline cases (i.e., situations where there is marginal significance). Surrogate samples Bayes factor Hierarchical Baye
3	Bayesian Model Checking in Multivariate Discrete Regression Problems Dong, Fanglong 03 November 2008 No description available. Statistics Bayesian statistics ordinal data Bayes factor deviance posterior distribution
4	Uma abordagem bayesiana para mapeamento de QTLs em populações experimentais / A Bayesian approach for mapping QTL in experimental populations Meyer, Andréia da Silva 03 April 2009 Many traits in plants and animals are quantitative in nature and influenced by multiple genes. With new molecular techniques, it has become possible to map the loci that control quantitative traits, called QTL (Quantitative Trait Loci). Mapping a QTL means identifying its position in the genome and estimating its genetic effects. The main difficulty in QTL mapping is that the number of QTL is unknown. Bayesian methods combined with Markov chain Monte Carlo (MCMC) have been implemented to jointly infer the number of QTL, their positions in the genome and their genetic effects. The challenge is to sample from the joint posterior distribution of these parameters, since the number of QTL is unknown and the dimension of the parameter space changes with the number of QTL in the model. In this study, a Bayesian approach to QTL mapping that accounts for multiple QTL and epistatic effects was implemented in the statistical program R. Models with an increasing number of QTL were fitted, and the Bayes factor was used to select the most suitable model and, consequently, to estimate the number of QTL that control the phenotypes of interest. To evaluate the efficiency of the implemented methodology, a simulation study was carried out for two experimental populations, backcross and F2, considering models with and without epistasis in both. The approach proved very efficient: in every scenario considered, the selected model was the one containing the true number of QTL used to simulate the data. In addition, QTL were mapped for three phenotypes of tropical maize (plant height, ear height and grain yield) using the implemented methodology, and the results were compared with those obtained by the CIM method. Bayes factor Bayesian inference Statistical genetics Genetic mapping MCMC Monte Carlo method QTL mapping
5	Bayesian Nonparametric Models for Multi-Stage Sample Surveys Yin, Jiani 27 April 2016 It is a standard practice in small area estimation (SAE) to use a model-based approach to borrow information from neighboring areas or from areas with similar characteristics. However, survey data tend to have gaps, ties and outliers, and parametric models may be problematic because statistical inference is sensitive to parametric assumptions. We propose nonparametric hierarchical Bayesian models for multi-stage finite population sampling to robustify the inference and allow for heterogeneity, outliers, skewness, etc. Bayesian predictive inference for SAE is studied by embedding a parametric model in a nonparametric model. The Dirichlet process (DP) has attractive properties such as clustering that permits borrowing information. We exemplify by considering in detail two-stage and three-stage hierarchical Bayesian models with DPs at various stages. The computational difficulties of the predictive inference when the population size is much larger than the sample size can be overcome by the stick-breaking algorithm and approximate methods. Moreover, the model comparison is conducted by computing log pseudo marginal likelihood and Bayes factors. We illustrate the methodology using body mass index (BMI) data from the National Health and Nutrition Examination Survey and simulated data. We conclude that a nonparametric model should be used unless there is a strong belief in the specific parametric form of a model. Bayes factor sample survey small area estimation robustness posterior propriety nonparametric procedure Dirichlet process
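The stick-breaking idea mentioned in entry 5 can be illustrated with a truncated construction of Dirichlet process weights. This is a generic textbook sketch, not the algorithm from the thesis: the truncation level, the concentration parameter `alpha`, the standard normal base distribution in the usage lines and all function names are assumptions made for illustration.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Truncated stick-breaking weights for a Dirichlet process.

    Each break takes a Beta(1, alpha) fraction of the stick that is left;
    the final break takes everything so that the weights sum to one.
    """
    weights = []
    remaining = 1.0
    for k in range(truncation):
        fraction = 1.0 if k == truncation - 1 else rng.betavariate(1.0, alpha)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    return weights


def draw_from_truncated_dp(alpha, truncation, base_sampler, size, rng):
    """Draw `size` values from one truncated DP realisation.

    Atoms come from the base distribution, so repeated values (clusters)
    appear in the output, which is the clustering property of the DP.
    """
    weights = stick_breaking_weights(alpha, truncation, rng)
    atoms = [base_sampler(rng) for _ in range(truncation)]
    return rng.choices(atoms, weights=weights, k=size)


if __name__ == "__main__":
    rng = random.Random(0)
    sample = draw_from_truncated_dp(
        alpha=2.0,
        truncation=50,
        base_sampler=lambda r: r.gauss(0.0, 1.0),  # assumed base distribution
        size=20,
        rng=rng,
    )
    print(len(set(sample)), "distinct values among", len(sample), "draws")
```

Smaller values of `alpha` concentrate the weight on fewer atoms, so draws share values more often.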
7	Bayesian compartmental models for zoonotic visceral leishmaniasis in the Americas Ozanne, Marie Veronica 01 May 2019 Visceral leishmaniasis (VL) is a serious neglected tropical disease that is endemic in 98 countries and presents a significant public health risk. The epidemiology of VL is complex. In the Americas, it is a zoonotic disease that is caused by a parasite and transmitted among humans and dogs through the bite of an infected sand fly vector. The infection also can be transmitted vertically from mother to child during pregnancy. Infected individuals can be classified as asymptomatic or symptomatic; both classes can transmit infection. In part due to its complexity, VL transmission dynamics are not fully understood. Stochastic compartmental epidemic models are a powerful set of tools that can be used to study these transmission dynamics. Past compartmental models for VL have been developed in a deterministic framework to accommodate complexity while remaining computationally tractable. In this work, we propose stochastic compartmental models for VL, which are simpler than their deterministic counterparts, but also have several advantages. Notably, this framework allows us to: (1) define a probability of infection transmission between two individuals, (2) obtain both parameter estimates and corresponding uncertainty measures, and (3) employ formal model comparisons. In this dissertation, we develop both population level and individual level Bayesian compartmental models to study both vector and vertical VL transmission dynamics. As part of this model development, we introduce a compartmental model that allows for two infectious classes. We also derive source specific reproductive numbers to quantify the contributions of different species and infectious classes to maintaining infection in a population. Finally, we propose a formal model comparison method for Bayesian models with high-dimensional discrete parameter spaces. 
These models, reproductive numbers, and model comparison method are explored in the context of simulations and real VL data from Brazil and the United States. Bayes factor empirically-adjusted reproductive number SAYVR SIR vector transmission vertical transmission Biostatistics
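To make the idea of a stochastic compartmental model in entry 7 concrete, here is a minimal discrete-time chain-binomial SIR simulation. It is a deliberately simplified stand-in, not the dissertation's model (which involves vectors, vertical transmission and two infectious classes): the single infectious class, the per-pair transmission probability `p_contact`, the recovery probability `p_recover` and the function names are all assumptions.

```python
import random


def draw_binomial(n, p, rng):
    """Binomial(n, p) draw built from n Bernoulli trials (standard library only)."""
    return sum(1 for _ in range(n) if rng.random() < p)


def simulate_chain_binomial_sir(s0, i0, r0, p_contact, p_recover, steps, rng):
    """Simulate a chain-binomial SIR epidemic in discrete time.

    p_contact is the probability that one infectious individual infects one
    susceptible individual in a time step, so a susceptible escapes all
    infectious individuals with probability (1 - p_contact) ** I.
    """
    s, i, r = s0, i0, r0
    path = [(s, i, r)]
    for _ in range(steps):
        p_infection = 1.0 - (1.0 - p_contact) ** i
        new_infections = draw_binomial(s, p_infection, rng)
        new_recoveries = draw_binomial(i, p_recover, rng)
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        path.append((s, i, r))
    return path


if __name__ == "__main__":
    rng = random.Random(1)
    for day, state in enumerate(
        simulate_chain_binomial_sir(95, 5, 0, 0.005, 0.2, steps=10, rng=rng)
    ):
        print(day, state)
```

Because each transition is a random draw, repeated runs give a distribution of epidemic paths, which is what allows uncertainty measures to accompany parameter estimates in a Bayesian fit.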
8	Contributions to Bayesian wavelet shrinkage Remenyi, Norbert 07 November 2012 This thesis provides contributions to research in Bayesian modeling and shrinkage in the wavelet domain. Wavelets are a powerful tool to describe phenomena rapidly changing in time, and wavelet-based modeling has become a standard technique in many areas of statistics, and more broadly, in sciences and engineering. Bayesian modeling and estimation in the wavelet domain have found useful applications in nonparametric regression, image denoising, and many other areas. In this thesis, we build on the existing techniques and propose new methods for applications in nonparametric regression, image denoising, and partially linear models. The thesis consists of an overview chapter and four main topics. In Chapter 1, we provide an overview of recent developments and the current status of Bayesian wavelet shrinkage research. The chapter contains an extensive literature review consisting of almost 100 references. The main focus of the overview chapter is on nonparametric regression, where the observations come from an unknown function contaminated with Gaussian noise. We present many methods which employ model-based and adaptive shrinkage of the wavelet coefficients through Bayes rules. These include new developments such as dependence models, complex wavelets, and Markov chain Monte Carlo (MCMC) strategies. Some applications of Bayesian wavelet shrinkage, such as curve classification, are discussed. In Chapter 2, we propose the Gibbs Sampling Wavelet Smoother (GSWS), an adaptive wavelet denoising methodology. We use the traditional mixture prior on the wavelet coefficients, but also formulate a fully Bayesian hierarchical model in the wavelet domain accounting for the uncertainty of the prior parameters by placing hyperpriors on them. 
Since a closed-form solution to the Bayes estimator does not exist, the procedure is computational, in which the posterior mean is computed via MCMC simulations. We show how to efficiently develop a Gibbs sampling algorithm for the proposed model. The developed procedure is fully Bayesian, is adaptive to the underlying signal, and provides good denoising performance compared to state-of-the-art methods. Application of the method is illustrated on a real data set arising from the analysis of metabolic pathways, where an iterative shrinkage procedure is developed to preserve the mass balance of the metabolites in the system. We also show how the methodology can be extended to complex wavelet bases. In Chapter 3, we propose a wavelet-based denoising methodology based on a Bayesian hierarchical model using a double Weibull prior. The interesting feature is that in contrast to the mixture priors traditionally used by some state-of-the-art methods, the wavelet coefficients are modeled by a single density. Two estimators are developed, one based on the posterior mean and the other based on the larger posterior mode; and we show how to calculate these estimators efficiently. The methodology provides good denoising performance, comparable even to state-of-the-art methods that use a mixture prior and an empirical Bayes setting of hyperparameters; this is demonstrated by simulations on standard test functions. An application to a real-world data set is also considered. In Chapter 4, we propose a wavelet shrinkage method based on a neighborhood of wavelet coefficients, which includes two neighboring coefficients and a parental coefficient. The methodology is called Lambda-neighborhood wavelet shrinkage, motivated by the shape of the considered neighborhood. We propose a Bayesian hierarchical model using a contaminated exponential prior on the total mean energy in the Lambda-neighborhood. 
The hyperparameters in the model are estimated by the empirical Bayes method, and the posterior mean, median, and Bayes factor are obtained and used in the estimation of the total mean energy. Shrinkage of the neighboring coefficients is based on the ratio of the estimated and observed energy. The proposed methodology is comparable and often superior to several established wavelet denoising methods that utilize neighboring information, which is demonstrated by extensive simulations. An application to a real-world data set from inductance plethysmography is considered, and an extension to image denoising is discussed. In Chapter 5, we propose a wavelet-based methodology for estimation and variable selection in partially linear models. The inference is conducted in the wavelet domain, which provides a sparse and localized decomposition appropriate for nonparametric components with various degrees of smoothness. A hierarchical Bayes model is formulated on the parameters of this representation, where the estimation and variable selection is performed by a Gibbs sampling procedure. For both the parametric and nonparametric part of the model we are using point-mass-at-zero contamination priors with a double exponential spread distribution. In this sense we extend the model of Chapter 2 to partially linear models. Only a few papers in the area of partially linear wavelet models exist, and we show that the proposed methodology is often superior to the existing methods with respect to the task of estimating model parameters. Moreover, the method is able to perform Bayesian variable selection by a stochastic search for the parametric part of the model. Bayes factor Bayesian estimation Bayesian infere Wavelets (Mathematics) Bayesian statistical decision theory Mathematical statistics
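The mixture-prior shrinkage that entry 8 builds on can be illustrated for a single wavelet coefficient. This sketch uses the simplest conjugate case, a point mass at zero mixed with a zero-mean normal "slab" and known Gaussian noise variance, which has a closed-form posterior mean. It is not any of the thesis's models (those use hyperpriors, double Weibull, contaminated exponential or double exponential components), and the parameter and function names are assumptions.

```python
from math import exp, log


def log_bayes_factor_slab_vs_spike(d, noise_var, slab_var):
    """Log Bayes factor for 'coefficient is nonzero' against 'coefficient is zero'.

    Marginally d ~ N(0, noise_var + slab_var) under the slab
    and d ~ N(0, noise_var) under the point mass at zero.
    """
    total_var = noise_var + slab_var
    return 0.5 * log(noise_var / total_var) + 0.5 * d * d * (1.0 / noise_var - 1.0 / total_var)


def posterior_slab_probability(d, noise_var, slab_var, slab_weight):
    """Posterior probability that the coefficient comes from the slab."""
    log_odds = log(slab_weight / (1.0 - slab_weight)) + log_bayes_factor_slab_vs_spike(
        d, noise_var, slab_var
    )
    return 1.0 / (1.0 + exp(-log_odds))


def shrink_coefficient(d, noise_var, slab_var, slab_weight):
    """Posterior mean of the true coefficient given the observed coefficient d."""
    linear_shrinkage = slab_var / (slab_var + noise_var)
    return posterior_slab_probability(d, noise_var, slab_var, slab_weight) * linear_shrinkage * d


if __name__ == "__main__":
    for d in (0.1, 1.0, 3.0, 10.0):
        print(d, shrink_coefficient(d, noise_var=1.0, slab_var=9.0, slab_weight=0.5))
```

Small observed coefficients are pulled almost to zero while large ones are shrunk only by the linear factor, which is the adaptive behaviour that makes such rules useful for denoising.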
9	Summarizing FLARE assay images in colon carcinogenesis Leyk Williams, Malgorzata 12 April 2006 Intestinal tract cancer is one of the more common cancers in the United States. While in some individuals a genetic component causes the cancer, the rate of cancer in the remainder of the population is believed to be affected by diet. Since cancer usually develops slowly, the amount of oxidative damage to DNA can be used as a cancer biomarker. This dissertation examines effective ways of analyzing FLARE assay data, which quantifies oxidative damage. The statistical methods will be implemented on data from a FLARE assay experiment, which examines cells from the duodenum and the colon to see if there is a difference in the risk of cancer due to corn or fish oil diets. Treatments of the oxidizing agent dextran sodium sulfate (DSS), DSS with a recovery period, as well as a control will also be used. Previous methods presented in the literature examined the FLARE data by summarizing the DNA damage of each cell with a single number, such as the relative tail moment (RTM). Variable skewness is proposed as an alternative measure, and shown to be as effective as the RTM in detecting diet and treatment differences in the standard analysis. The RTM and skewness data are then analyzed using a hierarchical model, with both the skewness and RTM showing diet/treatment differences. Simulated data for this model are also considered, and show that a Bayes Factor (BF) for higher dimensional models does not follow guidelines presented by Kass and Raftery (1995). It is hypothesized that more information is obtained by describing the DNA damage functions, instead of summarizing them with a single number. From each function, seven points are picked. First, they are modeled independently, and only diet effects are found. 
However, when the correlation between points at the cell and rat level is modeled, much stronger diet and treatment differences are shown both in the colon and the duodenum than for any of the previous methods. These results are also easier to interpret and represent graphically, showing that the latter is an effective method of analyzing the FLARE data. FLARE assay hierarchical models Bayes Factor comet assay corn oil fish oil
10	Bayesian Model Checking Strategies for Dichotomous Item Response Theory Models Toribio, Sherwin G. 16 June 2006 No description available. Item Response Theory Bayesian Model Checking Gibbs Sampling Predictive Distributions Bayes Factor
