  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Bayesian Model Selection for Spatial Data and Cost-constrained Applications

Porter, Erica May 03 July 2023 (has links)
Bayesian model selection is a useful tool for identifying an appropriate model class, dependence structure, and valuable predictors for a wide variety of applications. In this work we consider objective Bayesian model selection where no subjective information is available to inform priors on model parameters a priori, specifically in the case of hierarchical models for spatial data, which can have complex dependence structures. We develop an approach using trained priors via fractional Bayes factors where standard Bayesian model selection methods fail to produce valid probabilities under improper reference priors. This enables researchers to concurrently determine whether spatial dependence between observations is apparent and identify important predictors for modeling the response. In addition to model selection with objective priors on model parameters, we also consider the case where the priors on the model space are used to penalize individual predictors a priori based on their costs. We propose a flexible approach that introduces a tuning parameter to cost-penalizing model priors that allows researchers to control the level of cost penalization to meet budget constraints and accommodate increasing sample sizes. / Doctor of Philosophy / Spatial data, such as data collected over a geographic region, is relevant in many fields. Spatial data can require complex models to study, but use of these models can impose unnecessary computations and increased difficulty for interpretation when spatial dependence is weak or not present. We develop a method to simultaneously determine whether a spatial model is necessary to understand the data and choose important variables associated with the outcome of interest. Within a class of simpler, linear models, we propose a technique to identify important variables associated with an outcome when there exists a budget or general desire to minimize the cost of collecting the variables.
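The cost-penalizing model priors with a tuning parameter described above suggest a simple computational shape. The sketch below is purely illustrative, not the thesis's actual construction: the candidate models, predictor costs, log marginal likelihoods, and the exponential prior form pi(M) proportional to exp(-lam * cost(M)) are all assumptions made for the example.

```python
import math

# Hypothetical predictor costs and log marginal likelihoods (illustrative only).
predictor_costs = {"x1": 1.0, "x2": 5.0, "x3": 20.0}
log_marglik = {
    (): -120.0, ("x1",): -95.0, ("x2",): -96.0, ("x3",): -94.5,
    ("x1", "x2"): -93.0, ("x1", "x3"): -92.8,
    ("x2", "x3"): -93.5, ("x1", "x2", "x3"): -92.7,
}

def posterior_model_probs(lam):
    """Posterior model probabilities under the assumed prior
    pi(M) proportional to exp(-lam * cost(M)); lam tunes the penalization."""
    log_post = {m: lml - lam * sum(predictor_costs[p] for p in m)
                for m, lml in log_marglik.items()}
    mx = max(log_post.values())                  # stabilize the exponentials
    w = {m: math.exp(v - mx) for m, v in log_post.items()}
    z = sum(w.values())
    return {m: wi / z for m, wi in w.items()}
```

With lam = 0 the model prior is uniform and the full model wins; raising lam shifts posterior mass toward cheaper models, which is the budget-control behaviour the tuning parameter is meant to provide.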
2

A comparison of Bayesian model selection based on MCMC with an application to GARCH-type models

Miazhynskaia, Tatiana, Frühwirth-Schnatter, Sylvia, Dorffner, Georg January 2003 (has links) (PDF)
This paper presents a comprehensive review and comparison of five computational methods for Bayesian model selection, based on MCMC simulations from posterior model parameter distributions. We apply these methods to a well-known and important class of models in financial time series analysis, namely GARCH and GARCH-t models for conditional return distributions (assuming normal and t-distributions). We compare their performance vis-à-vis the more common maximum-likelihood-based model selection on both simulated and real market data. All five MCMC methods proved feasible in both cases, although differing in their computational demands. Results on simulated data show that for large degrees of freedom (where the t-distribution becomes more similar to a normal one), Bayesian model selection results in better decisions in favour of the true model than maximum likelihood. Results on market data show the feasibility of all model selection methods, mainly because the distributions appear to be decisively non-Gaussian. / Series: Report Series SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
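As a loose illustration of why the t models win on decisively non-Gaussian returns (not a reproduction of any of the paper's five MCMC estimators), one can compare a normal and a Student-t fit to simulated heavy-tailed returns by maximized likelihood and BIC; all data and numbers here are synthetic.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic heavy-tailed "returns" (Student-t with 4 degrees of freedom).
returns = stats.t.rvs(df=4, scale=0.01, size=2000, random_state=rng)

# Fit both candidate return distributions by maximum likelihood.
mu, sigma = stats.norm.fit(returns)
df_hat, loc, scale = stats.t.fit(returns)

n = len(returns)
ll_norm = stats.norm.logpdf(returns, mu, sigma).sum()
ll_t = stats.t.logpdf(returns, df_hat, loc, scale).sum()
bic_norm = 2 * np.log(n) - 2 * ll_norm   # normal model: 2 parameters
bic_t = 3 * np.log(n) - 2 * ll_t         # t model: 3 parameters
# For heavy-tailed data the t model attains the lower (better) BIC.
```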
3

Chromosome 3D Structure Modeling and New Approaches For General Statistical Inference

Rongrong Zhang (5930474) 03 January 2019 (has links)
This thesis consists of two separate topics: the use of piecewise helical models for the inference of 3D spatial organizations of chromosomes, and new approaches for general statistical inference. The recently developed Hi-C technology enables a genome-wide view of chromosome spatial organizations and has shed deep insights into genome structure and genome function. However, multiple sources of uncertainty make downstream data analysis and interpretation challenging. Specifically, statistical models for inferring three-dimensional (3D) chromosomal structure from Hi-C data are far from their maturity. Most existing methods are highly over-parameterized, lack clear interpretations, and are sensitive to outliers. We propose a parsimonious, easy-to-interpret, and robust piecewise helical curve model for the inference of 3D chromosomal structures from Hi-C data, for both individual topologically associated domains and whole chromosomes. When applied to a real Hi-C dataset, the piecewise helical model not only achieves much better model fitting than existing models, but also reveals that geometric properties of chromatin spatial organization are closely related to genome function.

For potential applications in big data analytics and machine learning, we propose to use deep neural networks to automate the Bayesian model selection and parameter estimation procedures. Two such frameworks are developed under different scenarios. First, we construct a deep neural network-based Bayes estimator for the parameters of a given model. The neural Bayes estimator mitigates the computational challenges faced by traditional approaches for computing Bayes estimators. When applied to generalized linear mixed models, the neural Bayes estimator outperforms existing methods implemented in R packages and SAS procedures. Second, we construct a deep convolutional neural network-based framework to perform simultaneous Bayesian model selection and parameter estimation. We refer to the neural networks for model selection and parameter estimation in this framework as the neural model selector and parameter estimator, respectively; both can be properly trained using labeled data systematically generated from candidate models. Simulation studies show that both the neural selector and estimator demonstrate excellent performance.

The theory of Conditional Inferential Models (CIMs) has been introduced to combine information for efficient inference in the Inferential Models framework for prior-free and yet valid probabilistic inference. While the general theory is subject to further development, the so-called regular CIMs are simple. We establish and prove a necessary and sufficient condition for the existence and identification of regular CIMs. More specifically, it is shown that for inference based on a sample from continuous distributions with unknown parameters, the corresponding CIM is regular if and only if the unknown parameters are generalized location and scale parameters indexing the transformations of an affine group.
4

Population SAMC, ChIP-chip Data Analysis and Beyond

Wu, Mingqi 2010 December 1900 (has links)
This dissertation research consists of two topics: population stochastic approximation Monte Carlo (Pop-SAMC) for Bayesian model selection problems, and ChIP-chip data analysis. The following two paragraphs give a brief introduction to each topic. Although reversible jump MCMC (RJMCMC) has the ability to traverse the space of possible models in Bayesian model selection problems, it is prone to becoming trapped in local modes when the model space is complex. SAMC, proposed by Liang, Liu and Carroll, essentially overcomes the difficulty of dimension-jumping moves by introducing a self-adjusting mechanism. However, this learning mechanism has not yet reached its maximum efficiency. In this dissertation, we propose a Pop-SAMC algorithm; it works on population chains of SAMC, which provide a more efficient self-adjusting mechanism, and makes use of the crossover operator from genetic algorithms to further increase its efficiency. Under mild conditions, the convergence of this algorithm is proved. The effectiveness of Pop-SAMC in Bayesian model selection problems is examined through a change-point identification example and a large-p linear regression variable selection example. The numerical results indicate that Pop-SAMC outperforms both single-chain SAMC and RJMCMC significantly. In the ChIP-chip data analysis study, we developed two methodologies to identify transcription factor binding sites: a Bayesian latent model and a population-based test. The former models the neighboring dependence of probes by introducing a latent indicator vector; the latter provides a nonparametric method for evaluating test scores in a multiple hypothesis test by making use of population information across samples. Both methods are applied to real and simulated datasets. The numerical results indicate that the Bayesian latent model can outperform existing methods, especially when the data contain outliers, and that the use of population information can significantly improve the power of multiple hypothesis tests.
5

Bayesian Model Selections for Log-binomial Regression

Zhou, Wei January 2018 (has links)
No description available.
6

Applying Model Selection on Ligand-Target Binding Kinetic Analysis / Applied Bayesian Statistics for Model Selection in Interaction Analysis

Djurberg, Klara January 2021 (has links)
The time-course of interaction formation or breaking can be studied using LigandTracer, and the data obtained from an experiment can be analyzed using a model of ligand-target binding kinetics. There are different kinetic models, and the choice of model is currently motivated by knowledge about the interaction, which is problematic when that knowledge is unsatisfactory. In this project, a Bayesian model selection procedure was implemented to motivate the model choice using the data obtained from studying a biological system. The model selection procedure was implemented for four kinetic models: the 1:1 model, the 1:2 model, the bivalent model, and a new version of the bivalent model. Bayesian inference was performed on the data using each of the models to obtain the posterior distributions of the parameters. Afterwards, the Bayes factor was approximated from numerical calculations of the marginal likelihood. Four numerical methods were implemented to approximate the marginal likelihood: the naïve Monte Carlo estimator, the harmonic mean of the likelihood, importance sampling, and sequential Monte Carlo. When tested on simulated data, importance sampling seemed to yield the most reliable prediction of the most likely model. The model selection procedure was then tested on experimental data expected to be from a 1:1 interaction, and its result did not agree with that expectation. Therefore, no reliable conclusion could be made when the model selection procedure was used to analyze the interaction between the anti-CD20 antibody Rituximab and Daudi cells. / Interactions can be analyzed using LigandTracer. Data from a LigandTracer experiment can then be analyzed with respect to a kinetic model. There are different kinetic models, and the choice of model is usually motivated by prior knowledge about the interaction, which is problematic when the available information about an interaction is insufficient. In this project, a Bayesian method was implemented to motivate the choice of model based on data from a LigandTracer experiment. The model selection procedure was implemented for four kinetic models: the 1:1 model, the 1:2 model, the bivalent model, and a new version of the bivalent model. Bayesian inference was used to obtain the posterior distributions of the models' parameters from the given data. The Bayes factor was then computed from numerical approximations of the marginal likelihood. Four numerical methods were implemented to approximate the marginal likelihood: the naïve Monte Carlo estimator, the harmonic mean of the likelihood, importance sampling, and sequential Monte Carlo. When the model selection procedure was tested on simulated data, importance sampling gave the most reliable prediction of which model generated the data. The method was also tested on experimental data expected to follow a 1:1 interaction, and the result deviated from what was expected. Consequently, no conclusion could be drawn from the model selection procedure when it was then used to analyze the interaction between the anti-CD20 antibody Rituximab and Daudi cells.
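Two of the marginal-likelihood estimators listed above, the naïve Monte Carlo estimator and importance sampling, can be illustrated on a conjugate normal toy model where the exact marginal likelihood is available in closed form; the kinetic binding models themselves are not reproduced, and all numbers are assumptions for the example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Toy model: y ~ N(theta, s2) with prior theta ~ N(0, t2), one observation,
# so the marginal likelihood is m(y) = N(y | 0, s2 + t2) in closed form.
y, s2, t2 = 1.3, 0.5, 2.0
exact = stats.norm.pdf(y, 0.0, np.sqrt(s2 + t2))

N = 200_000
# Naive Monte Carlo: average the likelihood over draws from the prior.
theta_prior = rng.normal(0.0, np.sqrt(t2), N)
naive = stats.norm.pdf(y, theta_prior, np.sqrt(s2)).mean()

# Importance sampling with the (here analytically known) posterior as proposal;
# the weights then have essentially zero variance, which is why proposal
# quality matters so much for this estimator.
post_var = 1.0 / (1.0 / s2 + 1.0 / t2)
post_mean = post_var * y / s2
theta_q = rng.normal(post_mean, np.sqrt(post_var), N)
weights = (stats.norm.pdf(y, theta_q, np.sqrt(s2))
           * stats.norm.pdf(theta_q, 0.0, np.sqrt(t2))
           / stats.norm.pdf(theta_q, post_mean, np.sqrt(post_var)))
imp = weights.mean()
```

Both estimates recover `exact`, but the importance-sampling estimate does so with far lower variance, consistent with the project's finding that importance sampling gave the most reliable model predictions.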
7

Bayesian Model Selection for High-dimensional High-throughput Data

Joshi, Adarsh 2010 May 1900 (has links)
Bayesian methods are often criticized on the grounds of subjectivity. Furthermore, misspecified priors can have a deleterious effect on Bayesian inference. Noting that model selection is effectively a test of many hypotheses, Dr. Valen E. Johnson sought to eliminate the need for prior specification by computing Bayes factors from frequentist test statistics. In pioneering work published in 2005, Dr. Johnson proposed using so-called local priors for computing Bayes factors from test statistics. Dr. Johnson and Dr. Jianhua Hu used Bayes factors for model selection in a linear model setting. In an independent work, Dr. Johnson and another colleague, David Rossell, investigated two families of non-local priors for testing the regression parameter in a linear model setting. These non-local priors enable greater separation between the theories of the null and alternative hypotheses. In this dissertation, I extend model selection based on Bayes factors and use non-local priors to define Bayes factors based on test statistics. With these priors, I have been able to reduce the problem of prior specification to setting just one scaling parameter, which can be easily set, for example, on the basis of the frequentist operating characteristics of the corresponding Bayes factors. Furthermore, the loss of information incurred by basing a Bayes factor on a test statistic is minimal. Along with Dr. Johnson and Dr. Hu, I used Bayes factors based on the likelihood ratio statistic to develop a method for clustering gene expression data. This method has performed well on both simulated examples and real datasets. An outline of that work is also included in this dissertation. Further, I extend the clustering model to a subclass of the decomposable graphical model class, which is more appropriate for genotype data sets, such as single-nucleotide polymorphism (SNP) data. Efficient FORTRAN programming has enabled me to apply the methodology to hundreds of nodes. For problems that produce computationally harder probability landscapes, I propose a modification of the Markov chain Monte Carlo algorithm to extract information about the important network structures in the data. This modified algorithm performs well in inferring complex network structures. I use this method to develop a prediction model for disease based on SNP data. My method performs well in cross-validation studies.
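The general idea of computing a Bayes factor directly from a test statistic can be sketched in its simplest form: for a z statistic, z ~ N(0, 1) under the null, and if the standardized effect is given a N(0, tau2) prior, then z ~ N(0, 1 + tau2) marginally under the alternative, giving a closed-form Bayes factor. This uses a simple local normal prior for illustration only, not the dissertation's non-local priors.

```python
import math

def bf10_from_z(z, tau2):
    """Bayes factor BF10 = N(z; 0, 1 + tau2) / N(z; 0, 1) for a z statistic,
    under an assumed N(0, tau2) prior on the standardized effect size."""
    var1 = 1.0 + tau2
    log_bf = (-0.5 * math.log(var1) - 0.5 * z * z / var1) + 0.5 * z * z
    return math.exp(log_bf)
# z = 0 favours the null (BF10 < 1); large |z| favours the alternative.
```

Here `tau2` plays the role of a single scaling parameter, which could be set, as the abstract suggests, from the frequentist operating characteristics of the resulting Bayes factor.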
8

Contributions to quality improvement methodologies and computer experiments

Tan, Matthias H. Y. 16 September 2013 (has links)
This dissertation presents novel methodologies for five problem areas in modern quality improvement and computer experiments, i.e., selective assembly, robust design with computer experiments, multivariate quality control, model selection for split plot experiments, and construction of minimax designs. Selective assembly has traditionally been used to achieve tight specifications on the clearance of two mating parts. Chapter 1 proposes generalizations of the selective assembly method to assemblies with any number of components and any assembly response function, called generalized selective assembly (GSA). Two variants of GSA are considered: direct selective assembly (DSA) and fixed bin selective assembly (FBSA). In DSA and FBSA, the problem of matching a batch of N components of each type to give N assemblies that minimize quality cost is formulated as axial multi-index assignment and transportation problems, respectively. Realistic examples are given to show that GSA can significantly improve the quality of assemblies. Chapter 2 proposes methods for robust design optimization with time-consuming computer simulations. Gaussian process models are widely employed for modeling responses as a function of control and noise factors in computer experiments. In these experiments, robust design optimization is often based on average quadratic loss computed as if the posterior mean were the true response function, which can give misleading results. We propose optimization criteria derived by taking the expectation of the average quadratic loss with respect to the posterior predictive process, and methods based on the Lugannani-Rice saddlepoint approximation for constructing accurate credible intervals for the average loss. These quantities allow response surface uncertainty to be taken into account in the optimization process. Chapter 3 proposes a Bayesian method for identifying mean shifts in multivariate normally distributed quality characteristics. Multivariate quality characteristics are often monitored using a few summary statistics. However, to determine the causes of an out-of-control signal, information about which means shifted and the directions of the shifts is often needed. We propose a Bayesian approach that gives this information. For each mean, an indicator variable that indicates whether the mean shifted upwards, shifted downwards, or remained unchanged is introduced. Default prior distributions are proposed. Mean shift identification is based on the modes of the posterior distributions of the indicators, which are determined via Gibbs sampling. Chapter 4 proposes a Bayesian method for model selection in fractionated split plot experiments. We employ a Bayesian hierarchical model that takes into account the split plot error structure. Expressions for computing the posterior model probability and other important posterior quantities that require evaluation of at most two uni-dimensional integrals are derived. A novel algorithm called combined global and local search is proposed to find models with high posterior probabilities and to estimate posterior model probabilities. The proposed method is illustrated with the analysis of three real robust design experiments. Simulation studies demonstrate that the method has good performance. The problem of choosing a design that is representative of a finite candidate set is an important problem in computer experiments. The minimax criterion measures the degree of representativeness because it is the maximum distance from a candidate point to the design. Chapter 5 proposes algorithms for finding minimax designs for finite design regions. We establish the relationship between minimax designs and the classical set covering location problem in operations research, which is a binary linear program. We prove that the set of minimax distances is the set of discontinuities of the function that maps the covering radius to the optimal objective function value, and that optimal solutions at the discontinuities are minimax designs. These results are employed to design efficient procedures for finding globally optimal minimax and near-minimax designs.
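To make the covering-radius objective above concrete, here is a minimal sketch using the greedy farthest-point (k-center) heuristic, a standard 2-approximation, rather than the chapter's exact set-covering binary-program formulation.

```python
import numpy as np

def greedy_minimax_design(candidates, k, seed_index=0):
    """Choose k design points from the candidate set so that the maximum
    candidate-to-design distance (the covering radius) is heuristically small."""
    pts = np.asarray(candidates, dtype=float)
    chosen = [seed_index]
    dist = np.linalg.norm(pts - pts[seed_index], axis=1)
    while len(chosen) < k:
        nxt = int(np.argmax(dist))       # farthest candidate from the design
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(pts - pts[nxt], axis=1))
    return chosen, float(dist.max())     # design indices, covering radius
```

On a 1-D candidate grid {0, 0.5, 1} with k = 2, for example, the heuristic selects the two endpoints, giving a covering radius of 0.5.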
9

Hippocampal-Temporopolar Connectivity Contributes to Episodic Simulation During Social Cognition

Pehrs, Corinna, Zaki, Jamil, Taruffi, Liila, Kuchinke, Lars, Koelsch, Stefan 28 September 2018 (has links)
People are better able to empathize with others when they are given information concerning the context driving that person's experiences. This suggests that people draw on prior memories when empathizing, but the mechanisms underlying this connection remain largely unexplored. The present study investigates how variations in episodic information shape the emotional response towards a movie character. Episodic information is either absent or provided by a written context preceding empathic film clips. It was shown that sad context information increases empathic concern for a movie character. This was tracked by neural activity in the temporal pole (TP) and anterior hippocampus (aHP). Dynamic causal modeling with Bayesian model selection showed that context changes the effective connectivity from the left aHP to the right TP. The same crossed-hemispheric coupling was found during rest, when people are left to their own thoughts. We conclude that (i) the integration of episodic memory also supports the specific case of integrating context into empathic judgments, (ii) the right TP supports emotion processing by integrating episodic memory into empathic inferences, and (iii) lateral integration is a key process for episodic simulation during rest and during task. We propose that a disruption of this mechanism may underlie empathy deficits in clinical conditions, such as autism spectrum disorder.
10

Exploring Brain Network Connectivity through Hemodynamic Modeling

Havlíček, Martin January 2012 (has links)
Functional magnetic resonance imaging (fMRI), which uses the blood-oxygen-level-dependent effect as an indicator of local activity, is a very useful technique for identifying brain regions that are active during perception, cognition, and action, but also during the resting state. Recently, there has been growing interest in studying the connectivity between these regions, particularly in the resting state. This thesis presents a new and original approach to the problem of the indirect relationship between the measured hemodynamic response and its cause, i.e., the neuronal signal. This indirect relationship complicates the estimation of effective connectivity (causal influence) between different brain regions from fMRI data. The novelty of the presented approach lies in the use of a (generalized nonlinear) blind deconvolution technique, which allows estimation of endogenous neuronal signals (i.e., system inputs) from the measured hemodynamic responses (i.e., system outputs). This means that the method enables data-driven assessment of effective connectivity at the neuronal level even when only noisy hemodynamic responses are measured. The solution of this difficult deconvolution (inverse) problem is achieved using nonlinear recursive Bayesian estimation, which provides a joint estimate of the unknown model states and parameters. The thesis is divided into three main parts. The first part proposes a method to solve the problem described above. The method uses square-root forms of the nonlinear cubature Kalman filter and the cubature Rauch-Tung-Striebel smoother, extended to solve the so-called joint estimation problem, defined as the simultaneous estimation of states and parameters in a sequential manner. The method is designed primarily for continuous-discrete systems and achieves an accurate and stable solution of the model discretization by combining the nonlinear (cubature) filter with a local linearization method. This inversion method is further complemented by adaptive estimation of the statistics of the measurement noise and the process noises (i.e., the noises of the unknown states and parameters). The first part of the thesis focuses on model inversion for a single time course only, i.e., on estimating neuronal activity from the fMRI signal. The second part generalizes the proposed approach and applies it to multiple time courses in order to enable estimation of the connectivity parameters of a neuronal interaction model, i.e., estimation of effective connectivity. This method represents an innovative stochastic treatment of dynamic causal modeling, which distinguishes it from previously introduced approaches. The second part also addresses Bayesian model selection methods and proposes a technique for detecting irrelevant connectivity parameters in order to achieve improved parameter estimation. Finally, the third part is devoted to validating the proposed approach using both simulated and empirical fMRI data, and provides substantial evidence of its very satisfactory performance.
