101 |
Bayesian Inference Approaches for Particle Trajectory Analysis in Cell Biology. Monnier, Nilah, 28 August 2013.
Despite the importance of single particle motion in biological systems, systematic inference approaches to analyze particle trajectories and evaluate competing motion models are lacking. An automated approach for robust evaluation of motion models that does not require manual intervention is highly desirable to enable analysis of datasets from high-throughput imaging technologies that contain hundreds or thousands of trajectories of biological particles, such as membrane receptors, vesicles, chromosomes or kinetochores, mRNA particles, or whole cells in developing embryos. Bayesian inference is a general theoretical framework for performing such model comparisons that has proven successful in handling noise and experimental limitations in other biological applications. The inherent Bayesian penalty on model complexity, which avoids overfitting, is particularly important for particle trajectory analysis given the highly stochastic nature of particle diffusion. This thesis presents two complementary approaches for analyzing particle motion using Bayesian inference. The first method, MSD-Bayes, discriminates a wide range of motion models--including diffusion, directed motion, anomalous and confined diffusion--based on mean-square displacement analysis of a set of particle trajectories, while the second method, HMM-Bayes, identifies dynamic switching between diffusive and directed motion along individual trajectories using hidden Markov models. These approaches are validated on biological particle trajectory datasets from a wide range of experimental systems, demonstrating their broad applicability to research in cell biology.
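As a concrete illustration of the MSD-based model comparison described above, here is a minimal sketch (not the authors' MSD-Bayes code; the 2D setting, unit frame interval, and the use of BIC as a rough stand-in for the Bayesian evidence are all assumptions):

```python
import numpy as np

def msd(traj):
    """Time-averaged mean-square displacement of a (T, d) trajectory."""
    lags = np.arange(1, len(traj))
    return lags, np.array([np.mean(np.sum((traj[lag:] - traj[:-lag]) ** 2, axis=1))
                           for lag in lags])

def bic(lags, m, model):
    """Least-squares fit of an MSD model; BIC penalizes the extra parameter."""
    t = lags * 1.0                        # frame interval of 1 (assumed)
    if model == "diffusion":              # MSD = 4*D*t in 2D
        X = t[:, None]
    else:                                 # directed: MSD = 4*D*t + v^2*t^2
        X = np.column_stack([t, t ** 2])
    coef, *_ = np.linalg.lstsq(X, m, rcond=None)
    rss = np.sum((m - X @ coef) ** 2)
    n, k = len(m), X.shape[1]
    return n * np.log(rss / n) + k * np.log(n)

rng = np.random.default_rng(0)
traj = np.cumsum(rng.normal(size=(200, 2)), axis=0)   # simulated pure 2D diffusion
lags, m = msd(traj)
for model in ("diffusion", "directed"):
    print(model, bic(lags, m, model))     # lower BIC should favour pure diffusion
```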
|
102 |
Function-on-Function Regression with Public Health Applications. Meyer, Mark John, 06 June 2014.
Medical research currently involves the collection of large and complex data. One such type is functional data, where the unit of measurement is a curve measured over a grid. Functional data comes in a variety of forms depending on the nature of the research, and novel methodologies are required to accommodate this growing volume of functional data, alongside new testing procedures that provide valid inferences. In this dissertation, I propose three novel methods to accommodate a variety of questions involving functional data of multiple forms: (1) a function-on-function regression for Gaussian data; (2) a historical functional linear model for repeated measures; and (3) a generalized functional outcome regression for ordinal data. For each method, I discuss the existing shortcomings of the literature and demonstrate how my method fills those gaps. The abilities of each method are demonstrated via simulation and data application.
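For readers unfamiliar with the function-on-function setting, the sketch below shows the basic shape of such a model, y_i(t) = ∫ beta(s,t) x_i(s) ds + error, fitted by simple basis expansion and least squares. The sinusoidal basis and grid sizes are illustrative assumptions, not the penalized machinery a full method would use:

```python
import numpy as np

rng = np.random.default_rng(1)
S = np.linspace(0, 1, 50)    # grid for predictor curves x_i(s)
T = np.linspace(0, 1, 40)    # grid for outcome curves y_i(t)
n, K = 100, 5

def basis(grid, K):
    """Simple sinusoidal basis evaluated on a grid."""
    return np.column_stack([np.sin((k + 1) * np.pi * grid) for k in range(K)])

Bs, Bt = basis(S, K), basis(T, K)
X = rng.normal(size=(n, len(S)))            # predictor curves on grid S
Z = X @ Bs * (S[1] - S[0])                  # (n, K) approximations of the integrals
beta_true = np.diag(np.linspace(1.0, 0.2, K))
Y = Z @ beta_true @ Bt.T + 0.1 * rng.normal(size=(n, len(T)))

# Fit: project outcome curves onto the t-basis, then regress scores on scores.
scores = Y @ Bt @ np.linalg.inv(Bt.T @ Bt)
C, *_ = np.linalg.lstsq(Z, scores, rcond=None)
beta_hat = Bs @ C @ Bt.T                    # estimated beta(s, t) surface on the grid
print(beta_hat.shape)                       # (50, 40)
```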
|
103 |
Accelerating Markov chain Monte Carlo via parallel predictive prefetching. Angelino, Elaine Lee, 21 October 2014.
We present a general framework for accelerating a large class of widely used Markov chain Monte Carlo (MCMC) algorithms. This dissertation demonstrates that MCMC inference can be accelerated in a model of parallel computation that uses speculation to predict and complete computational work ahead of when it is known to be useful. By exploiting fast, iterative approximations to the target density, we can speculatively evaluate many potential future steps of the chain in parallel. In Bayesian inference problems, this approach can accelerate sampling from the target distribution, without compromising exactness, by exploiting subsets of data. It takes advantage of whatever parallel resources are available, but produces results exactly equivalent to standard serial execution. In the initial burn-in phase of chain evaluation, it achieves speedup over serial evaluation that is close to linear in the number of available cores.
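A toy rendering of the speculative idea (not the dissertation's system): in Metropolis-Hastings the next state depends only on an accept/reject bit, so the log-density of the proposal that would follow acceptance can be evaluated concurrently with the current one. The toy target and thread-based executor are assumptions; real speedup requires genuinely expensive likelihood evaluations and parallel hardware.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def log_target(x):
    return -0.5 * float(x @ x)   # stand-in for an expensive posterior evaluation

def prefetch_mh(x0, n_steps, step=0.5, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    lp = log_target(x)
    chain = [x.copy()]
    with ThreadPoolExecutor(max_workers=2) as pool:
        for _ in range(n_steps):
            prop = x + step * rng.normal(size=x.shape)
            spec = prop + step * rng.normal(size=x.shape)  # proposal that follows if `prop` is accepted
            f_prop = pool.submit(log_target, prop)   # both densities evaluated concurrently,
            f_spec = pool.submit(log_target, spec)   # before the accept/reject bit is known
            lp_prop, _ = f_prop.result(), f_spec.result()
            if np.log(rng.uniform()) < lp_prop - lp:
                x, lp = prop, lp_prop   # speculation pays off: f_spec's work is reusable
            chain.append(x.copy())
    return np.array(chain)

print(prefetch_mh(np.zeros(3), 100).shape)   # (101, 3)
```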
|
104 |
Bayesian Data Association for Temporal Scene Understanding. Brau Avila, Ernesto, January 2013.
Understanding the content of a video sequence is not a particularly difficult problem for humans. We can easily identify objects, such as people, and track their position and pose within the 3D world. A computer system that could understand the world through videos would be extremely beneficial in applications such as surveillance, robotics, and biology. Despite significant advances in areas like tracking and, more recently, 3D static scene understanding, such a vision system does not yet exist. In this work, I present progress on this problem, restricted to videos of objects that move smoothly and are relatively easy to detect, such as people. Our goal is to identify all the moving objects in the scene and track their physical state (e.g., their 3D position or pose) in the world throughout the video. We develop a Bayesian generative model of a temporal scene, where we separately model data association, the 3D scene and imaging system, and the likelihood function. Under this model, the video data is the result of capturing the scene with the imaging system and noisily detecting video features. This formulation is very general and can be used to model a wide variety of scenarios, including videos of people walking and time-lapse images of pollen tubes growing in vitro. Importantly, we model the scene in world coordinates and units, as opposed to pixels, allowing us to reason about the world in a natural way, e.g., explaining occlusion and perspective distortion. We use Gaussian processes to model motion, and propose that they are a general and effective way to characterize smooth, but otherwise arbitrary, trajectories. We perform inference using MCMC sampling, where we fit our model of the temporal scene to data extracted from the videos. We address the problem of variable dimensionality by estimating data association and integrating out all scene variables. Our experiments show our approach is competitive, producing results comparable to those of state-of-the-art methods.
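The phrase "Gaussian processes to model motion" can be made concrete with a small sketch: a zero-mean GP with a squared-exponential kernel used as a prior over smooth 2D trajectories. Kernel choice and scales are assumptions, not the thesis's settings:

```python
import numpy as np

def rbf_kernel(t, length_scale=5.0, var=4.0):
    """Squared-exponential covariance over frame times t."""
    d = t[:, None] - t[None, :]
    return var * np.exp(-0.5 * (d / length_scale) ** 2)

rng = np.random.default_rng(2)
t = np.arange(100.0)                        # frame times
K = rbf_kernel(t) + 1e-8 * np.eye(len(t))   # jitter for numerical stability
L = np.linalg.cholesky(K)
traj = L @ rng.normal(size=(len(t), 2))     # one smooth 2D trajectory drawn from the prior
print(traj.shape)                           # (100, 2)
```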
|
105 |
Latent Conditional Individual-Level Models and Related Topics in Infectious Disease Modeling. Deeth, Lorna E., 15 October 2012.
Individual-level models are a class of complex statistical models, often fitted within a Bayesian Markov chain Monte Carlo framework, that have been used effectively to model the spread of infectious diseases. The ability of these models to incorporate individual-level covariate information makes them highly flexible and able to account for such characteristics as population heterogeneity. However, these models are subject to the inherent uncertainties often found in infectious disease data, and their complex nature can make fitting them to epidemic data computationally expensive, particularly for large populations.
An individual-level model that incorporates a latent grouping structure into the modeling procedure, based on some heterogeneous population characteristics, is investigated. The dependence of this latent conditional individual-level model on a discrete latent grouping variable alleviates the need for explicit, although possibly unreliable, covariate information. A simulation study is used to assess the posterior predictive ability of this model, in comparison to individual-level models that utilize the full covariate information, or that assume population homogeneity. These models are also applied to data from the 2001 UK foot-and-mouth disease epidemic.
When comparing complex models fitted within the Bayesian framework, the identification of appropriate model selection tools would be beneficial. The use of the deviance information criterion (DIC) as a model comparison tool, particularly for the latent conditional individual-level models, is investigated. A simulation study is used to compare five variants of the DIC, and the ability of each variant to select the true model is determined.
Finally, an investigation into methods to reduce the computational burden associated with individual-level models is carried out, based on an individual-level model that also incorporates population heterogeneity through a discrete grouping variable. A simulation study is used to determine the effect of reducing the overall population size by aggregating the data into spatial clusters. Reparameterized individual-level models, accounting for the aggregation effect, are fitted to the aggregated data. The effect of data aggregation on the ability of two reparameterized individual-level models to identify a covariate effect, as well as on the computational expense of the model fitting procedure, is explored.
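For reference, the sketch below computes the textbook DIC variant (posterior mean deviance plus the pD effective-parameter penalty) from MCMC output; the thesis compares five variants, and this is only the most common one. The toy normal model is an assumption for illustration.

```python
import numpy as np

def dic(loglik_draws, loglik_at_mean):
    """Spiegelhalter-style DIC from MCMC output: Dbar + pD."""
    deviance = -2.0 * np.asarray(loglik_draws)
    d_bar = deviance.mean()                        # posterior mean deviance
    p_d = d_bar + 2.0 * loglik_at_mean             # pD = Dbar - D(theta_bar)
    return d_bar + p_d

# Toy normal-mean model: mock posterior draws and the log-likelihood at each draw.
rng = np.random.default_rng(3)
y = rng.normal(1.0, 1.0, size=50)
theta = rng.normal(y.mean(), 1 / np.sqrt(len(y)), size=2000)
loglik = np.array([np.sum(-0.5 * np.log(2 * np.pi) - 0.5 * (y - t) ** 2) for t in theta])
loglik_mean = np.sum(-0.5 * np.log(2 * np.pi) - 0.5 * (y - theta.mean()) ** 2)
print("DIC:", dic(loglik, loglik_mean))
```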
|
106 |
Issues of Computational Efficiency and Model Approximation for Spatial Individual-Level Infectious Disease Models. Dobbs, Angie, 06 January 2012.
Individual-level models (ILMs) can use the spatio-temporal nature of disease data to capture disease dynamics. Parameter estimation is usually done via Markov chain Monte Carlo (MCMC) methods, but correlation between model parameters negatively affects MCMC mixing. Introducing a normalization constant to alleviate the correlation results in MCMC convergence over fewer iterations; however, this increases computation time.
It is important that model fitting is done as efficiently as possible. An upper-truncated distance kernel is introduced to quicken the computation of the likelihood, but this causes a loss in goodness-of-fit.
The normalization constant and upper-truncated distance kernel are evaluated as components of various ILMs via a simulation study. The normalization constant proves not to be worthwhile: the reduction in parameter correlation does not outweigh the increased computation time. The upper-truncated distance kernel reduces computation time but worsens model fit as the truncation distance decreases. Studies were funded by OMAFRA and NSERC, with computing equipment provided by CSI.
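A minimal sketch of the kernel idea evaluated above: an ILM one-step infection probability with a power-law distance kernel truncated at a cutoff distance, so pairs beyond the cutoff contribute nothing to the likelihood. Parameter values and the kernel form are illustrative assumptions, not the thesis's specification.

```python
import numpy as np

def infection_prob(suscept_xy, infect_xy, alpha=0.5, beta=2.0, d_max=5.0):
    """One-step P(infection) for a susceptible, truncated power-law kernel."""
    d = np.sqrt(np.sum((infect_xy - suscept_xy) ** 2, axis=1))
    d = np.maximum(d, 1e-12)                          # guard against coincident points
    kernel = np.where(d <= d_max, d ** (-beta), 0.0)  # pairs beyond d_max cost nothing
    return 1.0 - np.exp(-alpha * kernel.sum())

rng = np.random.default_rng(4)
infectious = rng.uniform(0, 20, size=(30, 2))         # locations of infectious individuals
print(infection_prob(np.array([10.0, 10.0]), infectious))
```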
|
107 |
Investigation of Genomic Estimated Breeding Values and Association Methodologies using Bayesian Inference in a Nellore-Angus Crossbred Population for Two Traits. Hulsman, Lauren Lorene, 16 December 2013.
The objectives of this study were to 1) evaluate marker associations for genomic regions of interest and significant ontology terms, 2) evaluate and compare 4 models for their efficacy in predicting genetic merit, 3) evaluate and compare the impact of using breed-of-origin genotypes in a Bayesian prediction model, and 4) evaluate the effects of data partitioning using family structure on predictions. Nellore-Angus F2, F3 and half-sibling calves were used with records for overall temperament at weaning (OTW; a subjective scoring system; n = 769) and Warner-Bratzler shear force (WBSF; a measure of tenderness; n = 389). After filtering, 34,913 markers were available for use. Bayesian methods employed were BayesB (using π̂) and BayesC (using π = 0 and π̂) in GenSel software, where, after estimation, π̂ = 0.995 or 0.997 for WBSF or OTW, respectively. No regions associated with either trait were found using π̂, but when π = 0, associated regions were identified (37 and 147 regions for OTW and WBSF, respectively). Comparison of genomic estimated breeding values from these 3 Bayesian models to an animal model showed that BayesC procedures (using π̂) had the highest accuracy for both traits, but BayesB had the lowest indication of bias in either case. Using a subset of the population (n = 440), genotypes based on the breed from which the alleles originated (i.e., breed-of-origin genotypes) were assigned to markers mapped to autosomes (n = 34,449) and incorporated into prediction analyses using BayesB (π̂ = 0.997), with or without nucleotide-based genotypes. In either case, accuracy increased when breed-of-origin genotypes were incorporated into the prediction analyses. Data partitions based on family structure resulted in 13 distinct training and validation groups. The relationship of training individuals to validation individuals had an impact in some cases, but not all. Prediction of genomic estimated breeding values for individuals in the validation population using BayesB methods was poor, but in all cases performed better than breeding values generated using an animal model. Future studies incorporating breed-of-origin genotypes are of interest to determine whether accuracy is improved in these groups.
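As a hypothetical illustration of the bookkeeping behind these comparisons (not the GenSel computations): a GEBV is the sum of estimated marker effects over an animal's genotype codes, and prediction accuracy is the correlation between GEBVs and true breeding values in a validation set. All quantities below are simulated assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
n_animals, n_markers = 400, 2000
geno = rng.integers(0, 3, size=(n_animals, n_markers)).astype(float)  # 0/1/2 allele counts

# ~0.5% of markers carry effects, echoing an estimated pi-hat near 0.995.
has_eff = rng.uniform(size=n_markers) < 0.005
true_eff = np.where(has_eff, rng.normal(0.0, 1.0, n_markers), 0.0)
tbv = geno @ true_eff                               # true breeding values

est_eff = true_eff + rng.normal(0.0, 0.3, n_markers) * has_eff  # mock noisy estimates
gebv = geno @ est_eff                               # genomic estimated breeding values

valid = slice(300, None)                            # a family-based split stands in here
print("validation accuracy:", np.corrcoef(gebv[valid], tbv[valid])[0, 1])
```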
|
108 |
Transfer learning with Gaussian processes. Skolidis, Grigorios, January 2012.
Transfer learning is an emerging framework for learning from data that aims at intelligently transferring information between tasks. This is achieved by developing algorithms that can perform multiple tasks simultaneously, as well as translating previously acquired knowledge to novel learning problems. In this thesis, we investigate the application of Gaussian processes to various forms of transfer learning, with a focus on classification problems. The thesis begins with a thorough introduction to the framework of transfer learning, providing a clear taxonomy of the areas of research. We then review the recent advances in multi-task learning for regression with Gaussian processes and compare the performance of some of these methods on a real data set. This review gives insights into the strengths and weaknesses of each method, which acts as a point of reference for applying these methods to other forms of transfer learning. The main contributions of this thesis are reported in the three following chapters. The third chapter investigates the application of multi-task Gaussian processes to classification problems. We extend a previously proposed model to the classification scenario, providing three inference methods due to the non-Gaussian likelihood the classification paradigm imposes. The fourth chapter extends the multi-task scenario to the semi-supervised case. Using labeled and unlabeled data, we construct a novel covariance function that is able to capture the geometry of the distribution of each task. This setup allows unlabeled data to be utilised to infer the level of correlation between the tasks. Moreover, we also discuss the potential use of this model in situations where no labeled data are available for certain tasks. The fifth chapter investigates a novel form of transfer learning called meta-generalising. The question at hand is whether, after training on a sufficient number of tasks, it is possible to make predictions on a novel task. In this situation, the predictor is embedded in an environment of multiple tasks but has no information about the origins of the test task. This elevates the concept of generalising from the level of data to the level of tasks. We employ a model based on a hierarchy of Gaussian processes, in a mixture-of-experts sense, to make predictions based on the relation between the distributions of the novel and the training tasks. Each chapter is accompanied by a thorough experimental section giving insights into the potential and the limits of the proposed methods.
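Multi-task GP models of the kind reviewed above are typically built on a coregionalization construction; the sketch below shows the standard intrinsic coregionalization form, where the joint covariance over (input, task) pairs combines a task-similarity matrix B with an input kernel k via a Kronecker product. This illustrates the general family, not a specific model from the thesis.

```python
import numpy as np

def rbf(X, ls=1.0):
    """Squared-exponential kernel matrix for inputs X of shape (n, d)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

rng = np.random.default_rng(6)
X = rng.uniform(-3, 3, size=(40, 1))
B = np.array([[1.0, 0.8],
              [0.8, 1.0]])                    # task-similarity matrix: two correlated tasks
K = np.kron(B, rbf(X)) + 1e-6 * np.eye(2 * len(X))
y = np.linalg.cholesky(K) @ rng.normal(size=2 * len(X))   # joint prior draw over both tasks
task1, task2 = y[:len(X)], y[len(X):]
print(np.corrcoef(task1, task2)[0, 1])        # empirical correlation of the two functions
```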
|
109 |
Bayesian Methods for Two-Sample Comparison. Soriano, Jacopo, January 2015.
Two-sample comparison is a fundamental problem in statistics. Given two samples of data, the interest lies in understanding whether the two samples were generated by the same distribution or not. Traditional two-sample comparison methods are not suitable for modern data where the underlying distributions are multivariate and highly multi-modal, and the differences across the distributions are often locally concentrated. The focus of this thesis is to develop novel statistical methodology for two-sample comparison which is effective in such scenarios. Tools from the nonparametric Bayesian literature are used to flexibly describe the distributions. Additionally, the two-sample comparison problem is decomposed into a collection of local tests on individual parameters describing the distributions. This strategy not only yields high statistical power, but also allows one to identify the nature of the distributional difference. In many real-world applications, detecting the nature of the difference is as important as the existence of the difference itself. Generalizations to multi-sample comparison and more complex statistical problems, such as multi-way analysis of variance, are also discussed.
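A toy stand-in for the "collection of local tests" strategy (not the thesis's nonparametric machinery): partition the space into bins and, in each bin, compute a Bayes factor comparing "each sample has its own occupancy rate" against "both samples share one rate" using beta-binomial marginal likelihoods. The binning and priors are assumptions.

```python
import numpy as np
from scipy.special import betaln

def local_log_bf(n1, k1, n2, k2, a=1.0, b=1.0):
    """log Bayes factor: separate bin-occupancy rates vs one shared rate."""
    def log_marg(k, n):     # beta-binomial marginal (binomial terms cancel in the ratio)
        return betaln(a + k, b + n - k) - betaln(a, b)
    return log_marg(k1, n1) + log_marg(k2, n2) - log_marg(k1 + k2, n1 + n2)

rng = np.random.default_rng(7)
x = rng.normal(0, 1, 500)
y = np.concatenate([rng.normal(0, 1, 450), rng.normal(3, 0.2, 50)])  # locally different sample
edges = np.linspace(-4, 4, 17)
cx, cy = np.histogram(x, edges)[0], np.histogram(y, edges)[0]
for lo, hi, k1, k2 in zip(edges, edges[1:], cx, cy):
    lbf = local_log_bf(len(x), k1, len(y), k2)
    if lbf > 2:             # flag bins with clear evidence of a local difference
        print(f"[{lo:+.1f}, {hi:+.1f}): log BF = {lbf:.1f}")
```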
|
110 |
Integrated modelling and Bayesian inference applied to population and disease dynamics in wildlife: M. bovis in badgers in Woodchester Park. Zijerveld, Leonardus Jacobus Johannes, January 2013.
Understanding demographic and disease processes in wildlife populations tends to be hampered by incomplete observations, which can include significant errors. Models provide useful insights into the potential impacts of key processes, and the value of such models greatly improves through integration with available data in a way that includes all sources of stochasticity and error. To date, the impact on disease of the spatial and social structures observed in wildlife populations has not been widely addressed in modelling.

I model the joint effects of differential fecundity and spatial heterogeneity on demography and disease dynamics, using a stochastic description of births, deaths, social-geographic migration, and disease transmission. A small set of rules governs the rates of births and movements in an environment where individuals compete for improved fecundity. This results in realistic population structures which, depending on the mode of disease transmission, can have a profound effect on disease persistence and therefore on disease control strategies in wildlife populations.

I also apply a simple model with births, deaths and disease events to the long-term observations of TB (Mycobacterium bovis) in badgers in Woodchester Park. The model is a continuous-time, discrete state space Markov chain and is fitted to the data using an implementation of Bayesian parameter inference with an event-based likelihood. This provides a flexible framework for combining data with expert knowledge (in terms of model structure and prior distributions of parameters) and allows us to quantify the model parameters and their uncertainties. Ecological observations tend to be restricted in terms of scope and spatio-temporal coverage, and estimates are also affected by trapping efficiency and disease test sensitivity. My method accounts for such limitations as well as the stochastic nature of the processes. I extend the likelihood function by including an error term that depends on the difference between observed and inferred state space variables. I also demonstrate that the estimates improve by increasing observation frequency, by combining the likelihoods of more than one group, and by allowing parameter values to vary through the application of hierarchical priors.
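A minimal sketch of the kind of process described above: a Gillespie simulation of a toy continuous-time birth/death/infection model, whose recorded event times and types are exactly the ingredients of an event-based likelihood. The rate values are illustrative assumptions, not estimates from the badger data.

```python
import numpy as np

def gillespie_si(S0=50, I0=2, birth=0.05, death=0.02, beta=0.002, t_max=200.0, seed=8):
    """Exact event-by-event simulation of a toy SI model with demography."""
    rng = np.random.default_rng(seed)
    t, S, I, events = 0.0, S0, I0, []
    while t < t_max and S + I > 0:
        rates = np.array([birth * (S + I),   # birth of a susceptible
                          death * S,         # death of a susceptible
                          death * I,         # death of an infected
                          beta * S * I])     # infection event
        total = rates.sum()
        if total == 0:
            break
        t += rng.exponential(1.0 / total)    # exponential waiting time to next event
        event = rng.choice(4, p=rates / total)
        if event == 0:
            S += 1
        elif event == 1:
            S -= 1
        elif event == 2:
            I -= 1
        else:
            S -= 1
            I += 1
        events.append((t, event))            # (time, type): the event-based data
    return events

print(len(gillespie_si()), "events simulated")
```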
|