1

Inference for Populations: Uncertainty Propagation via Bayesian Population Synthesis

Grubb, Christopher Thomas 16 August 2023 (has links)
In this dissertation, we develop a new type of prior distribution, specifically for populations themselves, which we denote the Dirichlet Spacing prior. This prior solves a specific problem that arises when attempting to create synthetic populations from a known subset: the unfortunate reality that assuming independence between population members means that every synthetic population will be essentially the same. This is a problem because any model which, given very incomplete information, yields only one result (or several very similar results) is fundamentally flawed. We motivate our need for this new class of priors using Agent-based Models, though this prior could be used in any situation requiring synthetic populations. / Doctor of Philosophy / Typically, statisticians work with parametric distributions governing independent observations. However, operating under the assumption of independence sometimes limits us severely. We motivate the move away from independent sampling via Agent-based Modeling (ABM), where full populations are needed. The assumption of independence, when applied to synthesizing populations, leads to unwanted results; specifically, all synthetic populations generated from the same sample data are essentially the same. To statisticians this is clearly problematic: given only a small subset of the population, we do not know what the full population looks like, and thus any model which always gives the same answer is fundamentally flawed. We fix this problem by utilizing a new class of distributions, which we call spacing priors, which allow us to create synthetic populations of individuals that are not independent of each other.
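A minimal sketch of the issue the abstract describes, using hypothetical data: drawing synthetic population members independently from a model fitted to the observed subset makes every synthetic population look essentially alike, whereas a Bayesian-bootstrap-style draw of Dirichlet weights over the sample restores between-population variability. This only illustrates the motivation; the Dirichlet Spacing prior itself is not implemented here.

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.normal(loc=50, scale=10, size=200)   # observed subset of the population
N = 10_000                                         # size of each synthetic population

def synth_independent(sample, N, rng):
    """Independent draws from a model fitted to the sample (here, a normal fit)."""
    mu, sd = sample.mean(), sample.std(ddof=1)
    return rng.normal(mu, sd, size=N)

def synth_dirichlet_weighted(sample, N, rng):
    """Resample with Dirichlet(1,...,1) weights, so populations differ across draws."""
    w = rng.dirichlet(np.ones(len(sample)))
    return rng.choice(sample, size=N, replace=True, p=w)

# Means of repeated synthetic populations: nearly constant under independence,
# visibly variable under the Dirichlet-weighted scheme.
ind_means = [synth_independent(sample, N, rng).mean() for _ in range(20)]
dir_means = [synth_dirichlet_weighted(sample, N, rng).mean() for _ in range(20)]
print("independent: sd of population means =", np.std(ind_means))
print("dirichlet-weighted: sd of population means =", np.std(dir_means))
```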
2

Bayes linear covariance matrix adjustment

Wilkinson, Darren James January 1995 (has links)
In this thesis, a Bayes linear methodology for the adjustment of covariance matrices is presented and discussed. A geometric framework for quantifying uncertainties about covariance matrices is set up, and an inner product for spaces of random matrices is motivated and constructed. The inner product on this space captures aspects of belief about the relationships between the covariance matrices of interest, providing a structure rich enough to adjust beliefs about unknown matrices in the light of data such as sample covariance matrices; second-order exchangeability and related specifications are exploited to obtain representations that allow this analysis. Adjustment is associated with orthogonal projection, and is illustrated by examples for some common problems. The difficulties of adjusting the covariance matrices underlying exchangeable random vectors are tackled and discussed. Learning about the covariance matrices associated with multivariate time series dynamic linear models is shown to be amenable to a similar approach. Diagnostics for matrix adjustments are also discussed.
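For readers unfamiliar with the Bayes linear formalism, the core update is second-order: the adjusted expectation and variance of a quantity B given data D are E_D(B) = E(B) + Cov(B,D)Var(D)^{-1}(D - E(D)) and Var_D(B) = Var(B) - Cov(B,D)Var(D)^{-1}Cov(D,B). The sketch below, with hypothetical numbers, illustrates only this generic adjustment; the thesis applies it to inner-product spaces of random covariance matrices, which is not reproduced here.

```python
import numpy as np

def bayes_linear_adjust(E_B, V_B, E_D, V_D, C_BD, d):
    """Generic Bayes linear update of beliefs about B given observed data d.

    E_D(B)   = E(B) + Cov(B,D) Var(D)^{-1} (d - E(D))
    Var_D(B) = Var(B) - Cov(B,D) Var(D)^{-1} Cov(D,B)
    """
    K = C_BD @ np.linalg.inv(V_D)
    adj_mean = E_B + K @ (d - E_D)
    adj_var = V_B - K @ C_BD.T
    return adj_mean, adj_var

# Toy example: adjust beliefs about two quantities after one noisy observation of the first
E_B = np.array([0.0, 0.0])
V_B = np.array([[1.0, 0.5], [0.5, 1.0]])
E_D = np.array([0.0])
V_D = np.array([[1.2]])          # Var(B_1) plus observation noise
C_BD = np.array([[1.0], [0.5]])  # D is a noisy reading of B_1
print(bayes_linear_adjust(E_B, V_B, E_D, V_D, C_BD, d=np.array([2.0])))
```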
3

RVD2: An ultra-sensitive variant detection model for low-depth heterogeneous next-generation sequencing data

He, Yuting 29 April 2014 (has links)
Motivation: Next-generation sequencing technology is increasingly being used for clinical diagnostic tests. Unlike research cell lines, clinical samples are often genomically heterogeneous due to low sample purity or the presence of genetic subpopulations. Therefore, a variant calling algorithm for calling low-frequency polymorphisms in heterogeneous samples is needed. Result: We present a novel variant calling algorithm that uses a hierarchical Bayesian model to estimate allele frequency and call variants in heterogeneous samples. We show that our algorithm improves upon current classifiers and has higher sensitivity and specificity over a wide range of median read depths and minor allele frequencies. We apply our model and identify twelve mutations in the PAXP1 gene in a matched clinical breast ductal carcinoma tumor sample, two of which are loss-of-heterozygosity events.
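A hedged illustration of posterior-based variant calling on read counts, not the RVD2 hierarchical model itself: under simple independent Beta priors, the posterior probability that the tumour-sample minor-allele frequency exceeds the matched-control frequency can be estimated by Monte Carlo. All counts and prior values below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def posterior_variant_prob(k_case, n_case, k_control, n_control,
                           a=1.0, b=1.0, draws=100_000):
    """P(theta_case > theta_control | data) under independent Beta(a, b) priors."""
    theta_case = rng.beta(a + k_case, b + n_case - k_case, size=draws)
    theta_ctrl = rng.beta(a + k_control, b + n_control - k_control, size=draws)
    return np.mean(theta_case > theta_ctrl)

# 30x coverage: 3 non-reference reads in the tumour vs 1 in the matched control
p = posterior_variant_prob(k_case=3, n_case=30, k_control=1, n_control=30)
print(f"posterior probability of a real variant: {p:.3f}")
```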
4

Statistical methods for neuroimaging data analysis and cognitive science

Song, Yin 29 May 2019 (has links)
This thesis presents research focused on developing statistical methods, with emphasis on tools that can be used for the analysis of data in neuroimaging studies and cognitive science. The first contribution addresses the problem of determining the location and dynamics of brain activity when electromagnetic signals are collected using magnetoencephalography (MEG) and electroencephalography (EEG). We formulate a new spatiotemporal model that jointly models MEG and EEG data as a function of unobserved neuronal activation. To fit this model we derive an efficient procedure for simultaneous point estimation and model selection based on the iterated conditional modes algorithm combined with local polynomial smoothing. The methodology is evaluated through extensive simulation studies and an application examining the visual response to scrambled faces. In the second contribution we develop a Bayesian spatial model for imaging genetics, designed for analyses examining the influence of genetics on brain structure as measured by MRI. We extend the recently developed regression model of Greenlaw et al. (Bioinformatics, 2017) to accommodate more realistic correlation structures typically seen in structural brain imaging data. We allow for spatial correlation in the imaging phenotypes obtained from neighbouring regions in the same hemisphere of the brain, and we also allow for correlation in the same phenotypes obtained from different hemispheres (left/right) of the brain. This correlation structure is incorporated through the use of a bivariate conditional autoregressive spatial model. Both Markov chain Monte Carlo (MCMC) and variational Bayes approaches are developed to approximate the posterior distribution, and Bayesian false discovery rate (FDR) procedures are developed to select SNPs using the posterior distribution while accounting for multiplicity. The methodology is evaluated through an analysis of MRI and genetic data obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI), and we show that the new spatial model exhibits improved performance on real data when compared to the non-spatial model of Greenlaw et al. (2017). In the third and final contribution we develop and investigate tools for the analysis of binary data arising from repeated measures designs. We propose a Bayesian approach for the mixed-effects analysis of accuracy studies using mixed binomial regression models and we investigate techniques for model selection. / Graduate
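As one concrete piece of the workflow described above, SNP selection with a Bayesian FDR rule can be sketched in a few lines: items are ranked by posterior inclusion probability and the largest set whose posterior expected FDR stays below a target level is reported. This is a generic sketch under those assumptions, not the thesis's exact procedure or its bivariate CAR model.

```python
import numpy as np

def bayesian_fdr_select(post_prob, alpha=0.05):
    """Select items (e.g. SNPs) so that the posterior expected FDR is at most alpha.

    post_prob[i] is the posterior probability that item i is a true signal.
    """
    order = np.argsort(-post_prob)                  # most probable signals first
    local_fdr = 1.0 - post_prob[order]              # P(null | data) for each ranked item
    expected_fdr = np.cumsum(local_fdr) / np.arange(1, len(post_prob) + 1)
    k = np.sum(expected_fdr <= alpha)               # largest prefix meeting the bound
    selected = np.zeros(len(post_prob), dtype=bool)
    selected[order[:k]] = True
    return selected

post = np.array([0.99, 0.97, 0.90, 0.60, 0.30, 0.05])   # hypothetical posterior probabilities
print(bayesian_fdr_select(post, alpha=0.05))
```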
5

Bayesian Information Fusion for Precision Indoor Location

Cavanaugh, Andrew F 07 February 2011 (has links)
This thesis documents work which is part of the ongoing effort by the Worcester Polytechnic Institute (WPI) Precision Personnel Locator (PPL) project to track and locate first responders in urban/indoor settings. Specifically, the project intends to produce a system which can accurately determine the floor that a person is on, as well as where on the floor that person is, with sub-meter accuracy. The system must be portable, rugged, fast to set up, and require no pre-installed infrastructure. Several recent advances have enabled us to get closer to meeting these goals: the development of the Transactional Array Reconciliation Tomography (TART) algorithm and corresponding locator hardware, as well as the integration of barometric sensors and a new antenna deployment scheme. To fully utilize these new capabilities, a Bayesian fusion algorithm has been designed. The goal of this thesis is to present the necessary methods for incorporating diverse sources of information, in a constructive manner, to improve the performance of the PPL system. While the conceptual methods presented within are meant to be general, the experimental results will focus on the fusion of barometric height estimates and RF data. These information sources will be processed with our existing Singular Value Array Reconciliation Tomography (σART) and the new TART algorithm, using a Bayesian fusion algorithm to more accurately estimate indoor locations.
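The simplest building block of such fusion is the precision-weighted combination of two independent Gaussian estimates of the same quantity, sketched below with hypothetical noise levels for a barometric height estimate and an RF-derived one. The PPL system's σART/TART processing is not reproduced here.

```python
import numpy as np

def gaussian_fuse(mu1, var1, mu2, var2):
    """Fuse two independent Gaussian estimates of the same quantity.

    The fused mean is the precision-weighted average and the fused variance
    is the inverse of the summed precisions.
    """
    prec1, prec2 = 1.0 / var1, 1.0 / var2
    fused_var = 1.0 / (prec1 + prec2)
    fused_mu = fused_var * (prec1 * mu1 + prec2 * mu2)
    return fused_mu, fused_var

# Hypothetical values: barometric height 7.4 m (sd 1.5 m), RF estimate 6.1 m (sd 0.8 m)
mu, var = gaussian_fuse(7.4, 1.5**2, 6.1, 0.8**2)
print(f"fused height: {mu:.2f} m (sd {np.sqrt(var):.2f} m)")
```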
6

Bayesian Logistic Regression with Spatial Correlation: An Application to Tennessee River Pollution

Marjerison, William M 15 December 2006 (has links)
"We analyze data (length, weight and location) from a study done by the Army Corps of Engineers along the Tennessee River basin in the summer of 1980. The purpose is to predict the probability that a hypothetical channel catfish at a location studied is toxic and contains 5 ppm or more DDT in its filet. We incorporate spatial information and treate it separetely from other covariates. Ultimately, we want to predict the probability that a catfish from the unobserved location is toxic. In a preliminary analysis, we examine the data for observed locations using frequentist logistic regression, Bayesian logistic regression, and Bayesian logistic regression with random effects. Later we develop a parsimonious extension of Bayesian logistic regression and the corresponding Gibbs sampler for that model to increase computational feasibility and reduce model parameters. Furthermore, we develop a Bayesian model to impute data for locations where catfish were not observed. A comparison is made between results obtained fitting the model to only observed data and data with missing values imputed. Lastly, a complete model is presented which imputes data for missing locations and calculates the probability that a catfish from the unobserved location is toxic at once. We conclude that length and weight of the fish have negligible effect on toxicity. Toxicity of these catfish are mostly explained by location and spatial effects. In particular, the probability that a catfish is toxic decreases as one moves further downstream from the source of pollution."
7

An investigation of a Bayesian decision-theoretic procedure in the context of mastery tests

Hsieh, Ming-Chuan 01 January 2007 (has links)
The purpose of this study was to extend Glas and Vos's (1998) Bayesian procedure to the 3PL IRT model by using the MCMC method. In the context of fixed-length mastery tests, the Bayesian decision-theoretic procedure was compared with two conventional procedures (conventional-Proportion Correct and conventional-EAP) across different simulation conditions. Several simulation conditions were investigated, including two loss functions (linear and threshold loss), three item pools (high discrimination, moderate discrimination and a real item pool) and three test lengths (20, 40 and 60 items). Different loss parameters were manipulated in the Bayesian decision-theoretic procedure to examine the effectiveness of controlling false positive and false negative errors. The degree of decision accuracy for the Bayesian decision-theoretic procedure using both the 3PL and 1PL models was also compared. Four criteria, including the percentage of correct classifications, false positive error rates, false negative error rates, and phi correlations between the true and observed classification status, were used to evaluate the results of this study. According to these criteria, the Bayesian decision-theoretic procedure appeared to effectively control false negative and false positive error rates. The differences in the percentages of correct classifications and phi correlations between true and predicted status for the Bayesian decision-theoretic procedures and the conventional procedures were quite small. The results also showed that there was no consistent advantage for either the linear or the threshold loss function; in relation to the four criteria used in this study, the values produced by these two loss functions were very similar. One of the purposes of this study was to extend the Bayesian procedure from the 1PL to the 3PL model. The results showed that when the datasets were simulated to fit the 3PL model, using the 1PL model in the Bayesian procedure yielded less accurate results. However, when the datasets were simulated to fit the 1PL model, using the 3PL model in the Bayesian procedure yielded reasonable classification accuracies in most cases. Thus, the use of the Bayesian decision-theoretic procedure with the 3PL model seems quite promising in the context of fixed-length mastery tests.
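The decision rule at the heart of such a procedure can be sketched briefly: given a posterior over ability theta and a mastery cutoff, classify in whichever way minimizes posterior expected loss under chosen false-positive and false-negative costs. This is a generic threshold-loss sketch with hypothetical values, not Glas and Vos's exact specification or the 3PL machinery.

```python
import numpy as np

def classify_mastery(theta_grid, posterior, cutoff, loss_fp=1.0, loss_fn=1.0):
    """Choose 'master' or 'non-master' by minimizing posterior expected threshold loss.

    loss_fp: cost of declaring mastery when true theta < cutoff (false positive)
    loss_fn: cost of declaring non-mastery when true theta >= cutoff (false negative)
    """
    posterior = posterior / posterior.sum()
    p_below = posterior[theta_grid < cutoff].sum()
    expected_loss_master = loss_fp * p_below            # wrong if theta < cutoff
    expected_loss_nonmaster = loss_fn * (1 - p_below)   # wrong if theta >= cutoff
    return "master" if expected_loss_master < expected_loss_nonmaster else "non-master"

# Hypothetical posterior over theta after a fixed-length test (normal approximation)
theta = np.linspace(-4, 4, 801)
post = np.exp(-0.5 * ((theta - 0.3) / 0.25) ** 2)
print(classify_mastery(theta, post, cutoff=0.0, loss_fp=2.0, loss_fn=1.0))
```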
8

Testing specifications in partial observability models : a Bayesian encompassing approach

Almeida, Carlos 04 October 2007 (has links)
A structural approach to modelling a statistical problem permits the introduction of a contextual theory based on previous knowledge. This approach makes the parameters completely meaningful; but, in the intermediate steps, some unobservable characteristics are introduced because of their contextual meaning. When the model is completely specified, marginalisation onto the observed variables is carried out in order to obtain a statistical model. The variables can be discrete or continuous, both at the level of the unobserved variables and at the level of the observed or manifest variables. We are sometimes faced, especially in the behavioural sciences, with ordinal variables; this is the case of the so-called Likert scales. An ordinal variable can therefore be interpreted as a discrete version of a latent concept (the discretization model). The normality of the latent variables reduces the study of this model to the analysis of the structure of the covariance matrix of the "ideally" measured variables, but only a sub-parameter of this matrix can be identified and consistently estimated (namely, the matrix of polychoric correlations). Consequently, two questions arise here: Is the normality of the latent variables testable? If not, what aspect of this hypothesis could be testable? In the discretization model, we observe a loss of information relative to the information contained in the latent variables. In order to treat this situation we introduce the concept of partial observability through a (non-bijective) measurable function of the latent variable. We explore this definition and verify that other models can be cast in terms of this concept. The definition of partial observability permits us to distinguish between two cases, depending on whether or not the function involved depends on a Euclidean parameter. Once partial observability is introduced, we present a set of conditions for building a specification test at the level of the latent variables. The test is built using the encompassing principle in a Bayesian framework. More precisely, the problem treated in this thesis is: how to test, in a Bayesian framework, the multivariate normality of a latent vector when only a discretized version of that vector is observed. More generally, the problem can be rephrased as: how to test, in a Bayesian framework, a parametric specification on latent variables against a nonparametric alternative when only a partial observation of these latent variables is available.
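A small simulation, with hypothetical thresholds, of the discretization model described above: a latent bivariate normal is observed only through ordinal (Likert-type) categories, and the ordinary correlation of the ordinal codes is attenuated relative to the latent correlation; only the polychoric correlation structure, not the full latent law, is identified from such data. The Bayesian encompassing test itself is not implemented here.

```python
import numpy as np

rng = np.random.default_rng(3)

def discretize(z, thresholds):
    """Map a latent continuous variable to ordinal categories (a Likert-type item)."""
    return np.digitize(z, thresholds)

# Latent bivariate normal with correlation 0.6, observed only as 5-point ordinal items
rho = 0.6
cov = np.array([[1.0, rho], [rho, 1.0]])
latent = rng.multivariate_normal([0.0, 0.0], cov, size=5000)
cuts = [-1.5, -0.5, 0.5, 1.5]
x = discretize(latent[:, 0], cuts)
y = discretize(latent[:, 1], cuts)

# The observed ordinal correlation is attenuated relative to the latent one.
print("latent correlation:  ", rho)
print("ordinal correlation: ", np.corrcoef(x, y)[0, 1].round(3))
```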
9

Bayesian analysis of some pricing and discounting models

Zantedeschi, Daniel 13 July 2012 (has links)
The dissertation comprises an introductory chapter, four papers and a summary chapter. First, a new class of Bayesian dynamic partition models for the Nelson-Siegel family of non-linear state-space Bayesian statistical models is developed. This class is applied to studying the term structure of government yields. A sequential time series of Bayes factors, developed from this approach, shows that the term structure could act as a leading indicator of economic activity. Second, we develop a class of non-MCMC algorithms called “Direct Sampling”; this chapter extends the basic algorithm with applications to the Generalized Method of Moments and Affine Term Structure Models. Third, financial economics is characterized by long-standing problems such as the equity premium and risk-free rate puzzles. In the chapter titled “Bayesian Learning, Distributional Uncertainty and Asset-Return Puzzles”, solutions for equilibrium prices under a set of subjective beliefs generated by Dirichlet Process priors are developed. It is shown that the “puzzles” could disappear if a “tail thickening” effect is induced by the representative agent. A novel Bayesian methodology for retrospective calibration of the model from historical data is developed, and this approach shows how predictive functionals have important welfare implications for long-term growth. Fourth, in “Social Discounting Using a Bayesian Nonparametric model” the problem of how to better quantify the uncertainty in long-term investments is considered from a Bayesian perspective. By incorporating distributional uncertainty, we are able to provide confidence measures that are less “pessimistic” than those of previous studies. These measures shed a new and different light on important cost-benefit analyses such as the valuation of environmental policies aimed at mitigating global warming. Finally, the last chapter discusses directions for future research and concludes the dissertation. / text
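For context on the first paper, the Nelson-Siegel curve expresses the yield at maturity tau through level, slope and curvature factors: y(tau) = beta0 + beta1 * (1 - e^{-lam*tau})/(lam*tau) + beta2 * [(1 - e^{-lam*tau})/(lam*tau) - e^{-lam*tau}]. The sketch below evaluates only this deterministic curve with hypothetical factor values; the Bayesian dynamic partition and state-space machinery is not reproduced.

```python
import numpy as np

def nelson_siegel(tau, beta0, beta1, beta2, lam):
    """Nelson-Siegel yield at maturity tau (level, slope and curvature factors)."""
    decay = (1 - np.exp(-lam * tau)) / (lam * tau)
    return beta0 + beta1 * decay + beta2 * (decay - np.exp(-lam * tau))

maturities = np.array([0.25, 1, 2, 5, 10, 30])      # years
# Hypothetical factor values: long-run level 4%, downward slope, mild curvature
yields = nelson_siegel(maturities, beta0=0.04, beta1=-0.02, beta2=0.01, lam=0.6)
print(np.round(yields * 100, 2), "% at each maturity")
```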
10

Municipal-level estimates of child mortality for Brazil : a new approach using Bayesian statistics

McKinnon, Sarah Ann 14 December 2010 (has links)
Current efforts to measure child mortality for municipalities in Brazil are hampered by the relative rarity of child deaths, which often results in unstable and unreliable estimates. As a result, it is not possible to accurately assess true levels of child mortality for many areas, hindering efforts towards constructing and implementing effective policy initiatives for the reduction of child mortality. However, with a spatial smoothing process based upon Bayesian statistics it is possible to “borrow” information from neighboring areas in order to generate more stable and accurate estimates of mortality in smaller areas. The objective of this study is to use this spatial smoothing process to derive estimates of child mortality at the level of the municipality in Brazil. Using data from the 2000 Brazil Census, I derive both Bayesian and non-Bayesian estimates of mortality for each municipality. In comparing the smoothed and raw estimates of this parameter, I find that the Bayesian estimates yield a clearer spatial pattern of child mortality, with smaller variances in less populated municipalities, thus more accurately reflecting the true mortality situation of those municipalities. These estimates can ultimately be used to inform more effective policies and health initiatives in the fight to reduce child mortality in Brazil. / text
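A toy sketch of the borrowing idea, with made-up counts: each municipality's raw rate is shrunk toward a rate pooled over its neighbors, with the amount of shrinkage governed by how much data the municipality itself has. This is a simple weighted-shrinkage illustration, not the specific Bayesian model used in the thesis.

```python
import numpy as np

def spatially_smoothed_rates(deaths, births, neighbors, m=50.0):
    """Shrink each municipality's raw child-mortality rate toward its neighbors' pooled rate.

    m acts as a prior sample size: municipalities with few births are pulled
    strongly toward the local neighborhood rate, large ones barely move.
    """
    raw = deaths / births
    smoothed = np.empty_like(raw)
    for i, nbrs in enumerate(neighbors):
        local = (deaths[nbrs].sum() + deaths[i]) / (births[nbrs].sum() + births[i])
        w = births[i] / (births[i] + m)             # weight on the area's own data
        smoothed[i] = w * raw[i] + (1 - w) * local
    return raw, smoothed

# Hypothetical toy map: three municipalities in a line, the middle one very small
deaths = np.array([12.0, 1.0, 30.0])
births = np.array([800.0, 40.0, 1500.0])
neighbors = [[1], [0, 2], [1]]
print(spatially_smoothed_rates(deaths, births, neighbors))
```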
