1

Bayes linear covariance matrix adjustment

Wilkinson, Darren James January 1995
In this thesis, a Bayes linear methodology for the adjustment of covariance matrices is presented and discussed. A geometric framework for quantifying uncertainties about covariance matrices is set up, and an inner-product for spaces of random matrices is motivated and constructed. The inner-product on this space captures aspects of belief about the relationships between covariance matrices of interest, providing a structure rich enough to adjust beliefs about unknown matrices in the light of data such as sample covariance matrices, exploiting second-order exchangeability and related specifications to obtain representations that allow analysis. Adjustment is associated with orthogonal projection and is illustrated by examples for some common problems. The difficulties of adjusting the covariance matrices underlying exchangeable random vectors are tackled and discussed. Learning about the covariance matrices associated with multivariate time series dynamic linear models is shown to be amenable to a similar approach. Diagnostics for matrix adjustments are also discussed.
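For context, the standard Bayes linear adjustment of a collection of quantities B by observed data D is given below; the thesis lifts this adjusted expectation and adjusted variance, via orthogonal projection in the constructed inner-product space, to the case where the quantities of interest are covariance matrices. The relations below are the generic Bayes linear formulas, not the thesis's matrix-specific construction.

```latex
% Generic Bayes linear adjusted expectation and adjusted variance of B by D.
\[
  \mathrm{E}_D(B) \;=\; \mathrm{E}(B) \;+\;
    \mathrm{Cov}(B, D)\,\mathrm{Var}(D)^{-1}\,\bigl(D - \mathrm{E}(D)\bigr),
\]
\[
  \mathrm{Var}_D(B) \;=\; \mathrm{Var}(B) \;-\;
    \mathrm{Cov}(B, D)\,\mathrm{Var}(D)^{-1}\,\mathrm{Cov}(D, B).
\]
```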
2

RVD2: An ultra-sensitive variant detection model for low-depth heterogeneous next-generation sequencing data

He, Yuting 29 April 2014
Motivation: Next-generation sequencing technology is increasingly being used for clinical diagnostic tests. Unlike research cell lines, clinical samples are often genomically heterogeneous due to low sample purity or the presence of genetic subpopulations. Therefore, a variant calling algorithm for calling low-frequency polymorphisms in heterogeneous samples is needed. Results: We present a novel variant calling algorithm that uses a hierarchical Bayesian model to estimate allele frequency and call variants in heterogeneous samples. We show that our algorithm improves upon current classifiers and has higher sensitivity and specificity over a wide range of median read depths and minor allele frequencies. We apply our model and identify twelve mutations in the PAXP1 gene in a matched clinical breast ductal carcinoma tumor sample, two of which are loss-of-heterozygosity events.
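As a rough illustration of the kind of calculation involved (not the hierarchical model of the thesis, which pools information across positions and control samples), a single-position Beta-Binomial sketch of calling a low-frequency variant from read counts might look as follows; the error rate, prior, and posterior cutoff are hypothetical values chosen for the example.

```python
import numpy as np
from scipy import stats

def call_variant(alt_reads, depth, error_rate=0.001,
                 alpha=1.0, beta=1.0, posterior_cutoff=0.95):
    """Toy Beta-Binomial variant call at one position (illustrative only).

    alt_reads / depth : non-reference read count and total read depth
    error_rate        : assumed sequencing error rate (hypothetical value)
    alpha, beta       : Beta prior on the minor allele frequency
    """
    # Posterior over the allele frequency under a Beta(alpha, beta) prior
    posterior = stats.beta(alpha + alt_reads, beta + depth - alt_reads)
    # Probability that the true frequency exceeds the assumed error rate
    p_variant = posterior.sf(error_rate)
    return p_variant > posterior_cutoff, p_variant

# Example: 8 non-reference reads out of 4000 (about a 0.2% minor allele frequency)
print(call_variant(alt_reads=8, depth=4000))
```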
3

Bayesian Information Fusion for Precision Indoor Location

Cavanaugh, Andrew F 07 February 2011
This thesis documents work which is part of the ongoing effort by the Worcester Polytechnic Institute (WPI) Precision Personnel Locator (PPL) project to track and locate first responders in urban/indoor settings. Specifically, the project intends to produce a system which can accurately determine the floor that a person is on, as well as where on the floor that person is, with sub-meter accuracy. The system must be portable, rugged, fast to set up, and require no pre-installed infrastructure. Several recent advances have enabled us to get closer to meeting these goals: the development of the Transactional Array Reconciliation Tomography (TART) algorithm and corresponding locator hardware, as well as the integration of barometric sensors and a new antenna deployment scheme. To fully utilize these new capabilities, a Bayesian fusion algorithm has been designed. The goal of this thesis is to present the necessary methods for incorporating diverse sources of information, in a constructive manner, to improve the performance of the PPL system. While the conceptual methods presented within are meant to be general, the experimental results focus on the fusion of barometric height estimates and RF data. These information sources are processed with our existing Singular Value Array Reconciliation Tomography (σART) algorithm and the new TART algorithm, using a Bayesian fusion algorithm to estimate indoor locations more accurately.
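In the simplest Gaussian case, the fusion step amounts to a precision-weighted combination of an RF-derived height estimate with a barometric estimate. The sketch below is a generic textbook fusion under assumed independent Gaussian errors, not the PPL system's TART/σART code, and the numbers are hypothetical.

```python
def fuse_gaussian(mean_rf, var_rf, mean_baro, var_baro):
    """Fuse two independent Gaussian height estimates (illustrative sketch).

    The fused estimate is again Gaussian, with precision equal to the sum of
    the individual precisions and mean equal to the precision-weighted average.
    """
    precision = 1.0 / var_rf + 1.0 / var_baro
    mean = (mean_rf / var_rf + mean_baro / var_baro) / precision
    return mean, 1.0 / precision

# Hypothetical numbers: RF says 7.2 m (sigma 1.5 m), barometer says 6.4 m (sigma 0.5 m)
print(fuse_gaussian(7.2, 1.5 ** 2, 6.4, 0.5 ** 2))
```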
4

Bayesian Logistic Regression with Spatial Correlation: An Application to Tennessee River Pollution

Marjerison, William M 15 December 2006
"We analyze data (length, weight and location) from a study done by the Army Corps of Engineers along the Tennessee River basin in the summer of 1980. The purpose is to predict the probability that a hypothetical channel catfish at a location studied is toxic and contains 5 ppm or more DDT in its filet. We incorporate spatial information and treate it separetely from other covariates. Ultimately, we want to predict the probability that a catfish from the unobserved location is toxic. In a preliminary analysis, we examine the data for observed locations using frequentist logistic regression, Bayesian logistic regression, and Bayesian logistic regression with random effects. Later we develop a parsimonious extension of Bayesian logistic regression and the corresponding Gibbs sampler for that model to increase computational feasibility and reduce model parameters. Furthermore, we develop a Bayesian model to impute data for locations where catfish were not observed. A comparison is made between results obtained fitting the model to only observed data and data with missing values imputed. Lastly, a complete model is presented which imputes data for missing locations and calculates the probability that a catfish from the unobserved location is toxic at once. We conclude that length and weight of the fish have negligible effect on toxicity. Toxicity of these catfish are mostly explained by location and spatial effects. In particular, the probability that a catfish is toxic decreases as one moves further downstream from the source of pollution."
5

An investigation of a Bayesian decision-theoretic procedure in the context of mastery tests

Hsieh, Ming-Chuan 01 January 2007
The purpose of this study was to extend Glas and Vos's (1998) Bayesian procedure to the 3PL IRT model by using the MCMC method. In the context of fixed-length mastery tests, the Bayesian decision-theoretic procedure was compared with two conventional procedures (conventional-Proportion Correct and conventional-EAP) across different simulation conditions. Several simulation conditions were investigated, including two loss functions (linear and threshold loss), three item pools (high discrimination, moderate discrimination, and a real item pool), and three test lengths (20, 40, and 60 items). Different loss parameters were manipulated in the Bayesian decision-theoretic procedure to examine the effectiveness of controlling false positive and false negative errors. The degree of decision accuracy for the Bayesian decision-theoretic procedure using both the 3PL and 1PL models was also compared. Four criteria, including the percentages of correct classifications, false positive error rates, false negative error rates, and phi correlations between the true and observed classification status, were used to evaluate the results of this study. According to these criteria, the Bayesian decision-theoretic procedure appeared to effectively control false negative and false positive error rates. The differences in the percentages of correct classifications and phi correlations between true and predicted status for the Bayesian decision-theoretic procedure and the conventional procedures were quite small. The results also showed that there was no consistent advantage for either the linear or the threshold loss function. In relation to the four criteria used in this study, the values produced by these two loss functions were very similar. One of the purposes of this study was to extend the Bayesian procedure from the 1PL to the 3PL model. The results showed that when the datasets were simulated to fit the 3PL model, using the 1PL model in the Bayesian procedure yielded less accurate results. However, when the datasets were simulated to fit the 1PL model, using the 3PL model in the Bayesian procedure yielded reasonable classification accuracies in most cases. Thus, the use of the Bayesian decision-theoretic procedure with the 3PL model seemed quite promising in the context of fixed-length mastery tests.
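The decision rule under a threshold loss function can be sketched as a comparison of posterior expected losses computed from MCMC draws of ability, as below; the loss parameters and the posterior are hypothetical, and the sketch omits the 3PL measurement model itself.

```python
import numpy as np

def mastery_decision(theta_draws, cutoff, loss_false_positive=2.0,
                     loss_false_negative=1.0):
    """Threshold-loss mastery decision from posterior ability draws (illustrative).

    theta_draws : MCMC draws of the examinee's ability
    cutoff      : mastery cut score on the theta scale
    The loss parameters are hypothetical; raising loss_false_positive makes
    the rule more conservative about declaring mastery.
    """
    p_master = np.mean(theta_draws >= cutoff)          # posterior P(true master)
    expected_loss_pass = (1.0 - p_master) * loss_false_positive
    expected_loss_fail = p_master * loss_false_negative
    return "pass" if expected_loss_pass < expected_loss_fail else "fail"

# Hypothetical posterior: ability centred just above a cutoff of 0.0
rng = np.random.default_rng(0)
print(mastery_decision(rng.normal(0.3, 0.4, size=5000), cutoff=0.0))
```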
6

Municipal-level estimates of child mortality for Brazil : a new approach using Bayesian statistics

McKinnon, Sarah Ann 14 December 2010
Current efforts to measure child mortality for municipalities in Brazil are hampered by the relative rarity of child deaths, which often results in unstable and unreliable estimates. As a result, it is not possible to accurately assess true levels of child mortality for many areas, hindering efforts towards constructing and implementing effective policy initiatives for the reduction of child mortality. However, with a spatial smoothing process based upon Bayesian statistics it is possible to “borrow” information from neighboring areas in order to generate more stable and accurate estimates of mortality in smaller areas. The objective of this study is to use this spatial smoothing process to derive estimates of child mortality at the level of the municipality in Brazil. Using data from the 2000 Brazil Census, I derive both Bayesian and non-Bayesian estimates of mortality for each municipality. In comparing the smoothed and raw estimates of this parameter, I find that the Bayesian estimates yield a clearer spatial pattern of child mortality with smaller variances in less populated municipalities, thus more accurately reflecting the true mortality situation of those municipalities. These estimates can then be used, ultimately, to inform more effective policies and health initiatives in the fight to reduce child mortality in Brazil.
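The "borrowing" idea can be illustrated by shrinking a municipality's raw rate toward the pooled rate of its neighbors, with the amount of shrinkage determined by the number of births observed. This is only a stand-in for the full Bayesian spatial model; the prior strength and the counts are hypothetical.

```python
import numpy as np

def smoothed_rate(deaths, births, neighbor_deaths, neighbor_births,
                  prior_strength=500.0):
    """Illustrative shrinkage of a municipal child-mortality rate toward the
    pooled rate of its neighbors (a stand-in for the full Bayesian model).

    prior_strength : pseudo-births assigned to the neighborhood rate, a
                     hypothetical tuning value; larger values mean more smoothing.
    """
    neighbor_rate = np.sum(neighbor_deaths) / np.sum(neighbor_births)
    # Small municipalities (few births) are pulled strongly toward their
    # neighbors; large ones keep a rate close to their raw estimate.
    return (deaths + prior_strength * neighbor_rate) / (births + prior_strength)

# Hypothetical small municipality: 2 deaths in 60 births, neighbors near 25 per 1000
print(smoothed_rate(2, 60, neighbor_deaths=[30, 18, 22],
                    neighbor_births=[1200, 700, 900]))
```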
7

Bayesian inference for models with infinite-dimensionally generated intractable components

Villalobos, Isadora Antoniano January 2012
No description available.
8

Bayesian matrix factorisation : inference, priors, and data integration

Brouwer, Thomas Alexander January 2017
In recent years the amount of biological data has increased exponentially. Most of these data can be represented as matrices relating two different entity types, such as drug-target interactions (relating drugs to protein targets), gene expression profiles (relating drugs or cell lines to genes), and drug sensitivity values (relating drugs to cell lines). Not only is the size of these datasets increasing, but so is the number of different entity types that they relate. Furthermore, not all values in these datasets are typically observed, and some datasets are very sparse. Matrix factorisation is a popular group of methods that can be used to analyse these matrices. The idea is that each matrix can be decomposed into two or more smaller matrices, such that their product approximates the original one. This factorisation of the data reveals patterns in the matrix, and gives us a lower-dimensional representation. Not only can we use this technique to identify clusters and other biological signals, we can also predict the unobserved entries, allowing us to prune biological experiments. In this thesis we introduce and explore several Bayesian matrix factorisation models, focusing on how best to use them for predicting these missing values in biological datasets. Our main hypothesis is that matrix factorisation methods, and in particular Bayesian variants, are an extremely powerful paradigm for predicting values in biological datasets, as well as in other applications, especially for sparse and noisy data. We demonstrate the competitiveness of these approaches compared to other state-of-the-art methods, and explore the conditions under which they perform best. We consider several aspects of the Bayesian approach to matrix factorisation. Firstly, we study the effect of the inference approach used to find the factorisation on predictive performance. Secondly, we identify different likelihood and Bayesian prior choices that we can use for these models, and explore when they are most appropriate. Finally, we introduce a Bayesian matrix factorisation model that can be used to integrate multiple biological datasets, and hence improve predictions. This model combines different matrix factorisation models and Bayesian priors in a hybrid fashion. Through these models and experiments we support our hypothesis and provide novel insights into the best ways to use Bayesian matrix factorisation methods for predictive purposes.
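A minimal non-Bayesian skeleton of the technique, alternating least squares on the observed entries of a matrix with the product of the two factors used to predict the missing entries, is sketched below. The Bayesian variants studied in the thesis instead place priors on the factors and infer them by Gibbs sampling or variational methods, but the prediction step is analogous.

```python
import numpy as np

def factorise(R, mask, rank=5, reg=0.1, iters=50, seed=0):
    """Minimal alternating-least-squares factorisation R ~ U @ V.T using only
    the observed entries (mask == 1); illustrative sketch, not the thesis code.
    """
    rng = np.random.default_rng(seed)
    n, m = R.shape
    U = rng.normal(scale=0.1, size=(n, rank))
    V = rng.normal(scale=0.1, size=(m, rank))
    for _ in range(iters):
        for i in range(n):                           # update each row factor
            obs = mask[i] == 1
            A = V[obs].T @ V[obs] + reg * np.eye(rank)
            U[i] = np.linalg.solve(A, V[obs].T @ R[i, obs])
        for j in range(m):                           # update each column factor
            obs = mask[:, j] == 1
            A = U[obs].T @ U[obs] + reg * np.eye(rank)
            V[j] = np.linalg.solve(A, U[obs].T @ R[obs, j])
    return U @ V.T                                   # predictions for all entries
```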
9

Uncertainty in inverse elasticity problems

Gendin, Daniel I. 27 September 2021
The non-invasive differential diagnosis of breast masses through ultrasound imaging motivates the following class of elastic inverse problems: given one or more measurements of the displacement field within an elastic material, determine the material property distribution within the material. This thesis is focused on uncertainty quantification in inverse problem solutions, with application to inverse problems in linear and nonlinear elasticity. We consider the inverse nonlinear elasticity problem in the context of Bayesian statistics. We show the well-known result that computing the maximum a posteriori (MAP) estimate is consistent with previous optimization formulations of the inverse elasticity problem. We show further that certainty in this estimate may be quantified using concepts from information theory, specifically, information gain as measured by the Kullback-Leibler (K-L) divergence and mutual information. A particular challenge in this context is the computational expense associated with computing these quantities. A key contribution of this work is a novel approach that exploits the mathematical structure of the inverse problem and properties of the conjugate gradient method to make these calculations feasible. A focus of this work is estimating the spatial distribution of the elastic nonlinearity of a material. Measurement sensitivity to the nonlinearity is much higher for large (finite) strains than for smaller strains, and so large strains tend to be used for such measurements. Measurements of larger deformations, however, tend to show greater levels of noise. A key finding of this work is that, when identifying nonlinear elastic properties, information gain can be used to characterize a trade-off between larger strains with higher noise levels and smaller strains with lower noise levels. These results can be used to inform experimental design. An approach often used to estimate both linear and nonlinear elastic property distributions is to do so sequentially: use a small-strain deformation to estimate the linear properties, and a large-strain deformation to estimate the nonlinearity. A key finding of this work is that accurate characterization of the joint posterior probability distribution over both linear and nonlinear elastic parameters requires that the estimates be performed jointly rather than sequentially. All the methods described above are demonstrated in applications to problems in elasticity for both simulated data and clinically measured data (obtained in vivo). For the clinical data, we evaluate the repeatability of measurements and parameter reconstructions.
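In the simplest one-dimensional Gaussian case, information gain reduces to a closed-form Kullback-Leibler divergence between posterior and prior, as in the sketch below; the numbers illustrating the large-strain versus small-strain trade-off are hypothetical, and the thesis computes the analogous quantities for high-dimensional elasticity parameter fields.

```python
import numpy as np

def gaussian_information_gain(prior_mean, prior_sd, post_mean, post_sd):
    """Information gain as KL(posterior || prior) for one-dimensional Gaussians.

    Illustrative only: the closed-form expression is
    log(s0/s1) + (s1^2 + (m1 - m0)^2) / (2 s0^2) - 1/2.
    """
    return (np.log(prior_sd / post_sd)
            + (post_sd ** 2 + (post_mean - prior_mean) ** 2) / (2.0 * prior_sd ** 2)
            - 0.5)

# Hypothetical trade-off: a noisier large-strain measurement that shifts and
# sharpens the posterior more can still yield a larger information gain than
# a cleaner small-strain measurement.
print(gaussian_information_gain(0.0, 1.0, 0.8, 0.4))   # large strain (hypothetical)
print(gaussian_information_gain(0.0, 1.0, 0.3, 0.6))   # small strain (hypothetical)
```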
10

Making Sense of the Noise: Statistical Analysis of Environmental DNA Sampling for Invasive Asian Carp Monitoring Near the Great Lakes

Song, Jeffery W. 01 May 2017
Sensitive and accurate detection methods are critical for monitoring and managing the spread of aquatic invasive species, such as invasive Silver Carp (SC; Hypophthalmichthys molitrix) and Bighead Carp (BH; Hypophthalmichthys nobilis) near the Great Lakes. A new detection tool called environmental DNA (eDNA) sampling, the collection and screening of water samples for the presence of the target species’ DNA, promises improved detection sensitivity compared to conventional surveillance methods. However, the application of eDNA sampling for invasive species management has been challenging due to the potential for false positives, i.e., detections of a species’ eDNA in the absence of live organisms. In this dissertation, I study the sources of error and uncertainty in eDNA sampling and develop statistical tools to show how eDNA sampling should be utilized for monitoring and managing invasive SC and BH in the United States. In chapter 2, I investigate the environmental and hydrologic variables, e.g., reverse flow, that may be contributing to positive eDNA sampling results upstream of the electric fish dispersal barrier in the Chicago Area Waterway System (CAWS), where live SC are not expected to be present. I used a beta-binomial regression model, which showed that reverse flow volume across the barrier has a statistically significant positive relationship with the probability of SC eDNA detection upstream of the barrier from 2009 to 2012, while other covariates, such as water temperature, season, and chlorophyll concentration, do not. This is a potential alternative explanation for why SC eDNA has been detected upstream of the barrier but intact SC have not. In chapter 3, I develop and parameterize a statistical model to evaluate how changes made to the US Fish and Wildlife Service (USFWS)’s eDNA sampling protocols for invasive BH and SC monitoring from 2013 to 2015 have influenced their sensitivity. The model shows that changes to the protocol have caused the sensitivity to fluctuate. Overall, when assuming that eDNA is randomly distributed, the sensitivity of the current protocol is higher for BH eDNA detection and similar for SC eDNA detection compared to the original protocol used from 2009 to 2012. When assuming that eDNA is clumped, the sensitivity of the current protocol is slightly higher for BH eDNA detection but worse for SC eDNA detection. In chapter 4, I apply the model developed in chapter 3 to estimate the BH and SC eDNA concentration distributions in two pools of the Illinois River where BH and SC are considered to be present, one pool where they are absent, and upstream of the electric barrier in the CAWS, given eDNA sampling data and knowledge of the eDNA sampling protocol used in 2014. The results show that the estimated mean eDNA concentrations in the Illinois River are highest in the invaded pools (La Grange; Marseilles) and are lower in the uninvaded pool (Brandon Road). The estimated eDNA concentrations in the CAWS are much lower than the concentrations in the Marseilles pool, which indicates that the few eDNA detections in the CAWS (3% of samples positive for SC and 0.4% of samples positive for BH) do not signal the presence of live BH or SC. The model shows that >50% of samples positive for BH or SC eDNA would be needed to infer Asian carp presence in the CAWS, i.e., estimated concentrations similar to those found in the Marseilles pool.
Finally, in chapter 5, I develop a decision tree model to evaluate the value of information that monitoring provides for making decisions about BH and SC prevention strategies near the Great Lakes. The optimal prevention strategy depends on prior beliefs about the expected damage of an Asian carp invasion, the probability of invasion, and whether or not BH and SC have already invaded the Great Lakes (which is informed by monitoring). Given no monitoring, the optimal strategy is to stay with the status quo of operating electric barriers in the CAWS for low probabilities of invasion and low expected invasion costs. However, if the probability of invasion is greater than 30% and the cost of invasion is greater than $100 million a year, the optimal strategy changes to installing an additional barrier in the Brandon Road pool. Greater risk aversion (i.e., aversion to monetary losses) causes less prevention (e.g., the status quo instead of additional barriers) to be preferred. Given monitoring, the model shows that monitoring provides value for this decision only if the monitoring tool has perfect specificity (false positive rate = 0%).
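A sketch of the beta-binomial regression idea from chapter 2, an overdispersed binomial likelihood for the number of positive samples in a sampling event with the mean detection probability linked to reverse-flow volume, is given below; the coefficients, concentration parameter, and counts are hypothetical placeholders rather than the dissertation's fitted values.

```python
import numpy as np
from scipy import stats

def detection_probability(reverse_flow, b0=-3.0, b1=0.8):
    """Mean eDNA detection probability as a function of reverse-flow volume
    (logit link). The coefficients are hypothetical, not fitted values."""
    return 1.0 / (1.0 + np.exp(-(b0 + b1 * reverse_flow)))

def loglik_event(positives, n_samples, reverse_flow, concentration=10.0):
    """Beta-binomial log-likelihood for one sampling event, allowing
    overdispersion relative to a plain binomial (illustrative sketch)."""
    p = detection_probability(reverse_flow)
    a, b = p * concentration, (1.0 - p) * concentration
    return stats.betabinom(n_samples, a, b).logpmf(positives)

# Hypothetical event: 3 positive samples out of 60 after a reverse-flow event
print(loglik_event(positives=3, n_samples=60, reverse_flow=2.0))
```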
