81 |
Bayes linear covariance matrix adjustment
Wilkinson, Darren James, January 1995
In this thesis, a Bayes linear methodology for the adjustment of covariance matrices is presented and discussed. A geometric framework for quantifying uncertainties about covariance matrices is set up, and an inner product for spaces of random matrices is motivated and constructed. The inner product on this space captures aspects of belief about the relationships between covariance matrices of interest, providing a structure rich enough to adjust beliefs about unknown matrices in the light of data such as sample covariance matrices, exploiting second-order exchangeability and related specifications to obtain representations allowing analysis. Adjustment is associated with orthogonal projection, and illustrated by examples for some common problems. The difficulties of adjusting the covariance matrices underlying exchangeable random vectors are tackled and discussed. Learning about the covariance matrices associated with multivariate time series dynamic linear models is shown to be amenable to a similar approach. Diagnostics for matrix adjustments are also discussed.
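As a rough illustration of adjustment as projection, the sketch below applies the standard Bayes linear update for vector quantities, E_D(B) = E(B) + Cov(B, D) Var(D)^{-1} (D - E(D)), with adjusted variance Var(B) - Cov(B, D) Var(D)^{-1} Cov(D, B). It is not the matrix-space construction developed in the thesis; the function name `bayes_linear_adjust` and the toy numbers are assumptions made for illustration only.

```python
import numpy as np

def bayes_linear_adjust(e_b, e_d, var_b, var_d, cov_bd, d_obs):
    """Return the adjusted expectation and variance of B given observed D.

    Hypothetical helper: the generic second-order (Bayes linear) update
    for vectors, not the thesis's adjustment of covariance matrices.
    """
    e_b = np.asarray(e_b, dtype=float)
    e_d = np.asarray(e_d, dtype=float)
    var_b = np.asarray(var_b, dtype=float)
    var_d = np.asarray(var_d, dtype=float)
    cov_bd = np.asarray(cov_bd, dtype=float)
    d_obs = np.asarray(d_obs, dtype=float)
    # Solve Var(D) x = (d - E(D)) instead of forming an explicit inverse.
    adj_e = e_b + cov_bd @ np.linalg.solve(var_d, d_obs - e_d)
    adj_var = var_b - cov_bd @ np.linalg.solve(var_d, cov_bd.T)
    return adj_e, adj_var

# Toy example (assumed numbers): D = B + noise, Var(B) = 1, noise variance 1.
adj_e, adj_var = bayes_linear_adjust(
    e_b=[0.0], e_d=[0.0],
    var_b=[[1.0]], var_d=[[2.0]],
    cov_bd=[[1.0]], d_obs=[2.0],
)
```

In this toy case the observation moves the expectation of B halfway towards the data and halves the prior variance, which is the usual behaviour when prior and noise variances are equal.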
|
82 |
RVD2: An ultra-sensitive variant detection model for low-depth heterogeneous next-generation sequencing data
He, Yuting, 29 April 2014
Motivation: Next-generation sequencing technology is increasingly being used for clinical diagnostic tests. Unlike research cell lines, clinical samples are often genomically heterogeneous due to low sample purity or the presence of genetic subpopulations. Therefore, a variant calling algorithm for calling low-frequency polymorphisms in heterogeneous samples is needed. Result: We present a novel variant calling algorithm that uses a hierarchical Bayesian model to estimate allele frequency and call variants in heterogeneous samples. We show that our algorithm improves upon current classifiers and has higher sensitivity and specificity over a wide range of median read depth and minor allele frequency. We apply our model and identify twelve mutations in the PAXP1 gene in a matched clinical breast ductal carcinoma tumor sample, two of which are loss-of-heterozygosity events.
|
83 |
A path-specific approach to SEIR modeling
Porter, Aaron Thomas, 01 May 2012
Despite being developed in the late 1920s, compartmental epidemic modeling is still a rich and fruitful area of research. The original compartmental epidemic models were SIR (Susceptible, Infectious, Removed) models, which assume permanent immunity after recovery. SIR models, along with the more recent SEIR (Susceptible, Exposed, Infectious, Removed) models, are still the gold standard in modeling pathogens that confer permanent immunity. This dissertation expands the SEIR structure to include a new class of spatial SEIR models. The exponential assumption of these models states that the latent and infectious times of the pathogen are exponentially distributed. Work that relaxes this assumption and still allows for mixing to occur at the population level is limited, so most existing models make strong assumptions about these times. We relax this assumption in a flexible way, by considering a hybrid approach that contains characteristics of both population level and individual level approaches. Next, we expand the Conditional Autoregressive (CAR) class of spatial models. This is to account for the Mumps data set we have procured, which contains mismatched lattice structures that cannot be handled by traditional CAR models. The use of CAR models is desirable here, as these models are known to produce spatial smoothing on lattices, and are a natural way to draw strength spatially in estimating spatial effects. Finally, we develop a pair of spatial SEIR models utilizing our CAR structure. The first utilizes the exponential assumption, which is very robust. The second develops a highly flexible spatial SEIR model by embedding the CAR structure into the SEIR structure. This allows for a realistic analysis of epidemic data occurring on a lattice. These models are applied to the Iowa Mumps epidemic of 2006. There are three questions of interest. First, what improvement do the methods proposed here provide over the current models in the literature?
Second, did spring break, which occurred approximately 40 days into the epidemic, have an effect on the overall number of new infections? Third, did the public's awareness of the epidemic change the rate at which mixing occurred over time? The spatial models in this dissertation are adequately constructed to answer these questions, and the results are provided.
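To make the exponential assumption concrete, here is a minimal, non-spatial chain-binomial SEIR sketch with daily time steps. It is not the path-specific or CAR-based model of the dissertation. Under exponentially distributed latent and infectious times with mean m days, the probability of leaving a compartment within one day is 1 - exp(-1/m), which is what the code uses. The function name `simulate_seir` and all parameter values are assumptions for illustration.

```python
import numpy as np

def simulate_seir(n, e0, i0, beta, latent_mean, infectious_mean, days, seed=0):
    """Simulate a discrete-time chain-binomial SEIR epidemic.

    Hypothetical sketch: population-level mixing, no spatial structure.
    Returns an array of shape (days + 1, 4) with columns S, E, I, R.
    """
    rng = np.random.default_rng(seed)
    s, e, i, r = n - e0 - i0, e0, i0, 0
    # Exponential dwell time with mean m => daily exit probability 1 - exp(-1/m).
    p_ei = 1.0 - np.exp(-1.0 / latent_mean)
    p_ir = 1.0 - np.exp(-1.0 / infectious_mean)
    history = [(s, e, i, r)]
    for _ in range(days):
        # Infection probability grows with the infectious fraction i / n.
        p_se = 1.0 - np.exp(-beta * i / n)
        new_e = rng.binomial(s, p_se)   # S -> E
        new_i = rng.binomial(e, p_ei)   # E -> I
        new_r = rng.binomial(i, p_ir)   # I -> R
        s, e, i, r = s - new_e, e + new_e - new_i, i + new_i - new_r, r + new_r
        history.append((s, e, i, r))
    return np.array(history)

# Assumed toy values: 1000 people, 10 initially infectious, 60 days.
hist = simulate_seir(n=1000, e0=0, i0=10, beta=0.5,
                     latent_mean=3.0, infectious_mean=5.0, days=60)
```

Relaxing the exponential assumption would mean replacing the constant daily exit probabilities `p_ei` and `p_ir` with ones that depend on how long an individual has already spent in the compartment, which is why such models need some individual-level bookkeeping.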
|
84 |
Bayesian analysis of rainfall-runoff models: insights to parameter estimation, model comparison and hierarchical model development
Marshall, Lucy Amanda, Civil & Environmental Engineering, Faculty of Engineering, UNSW, January 2006
One challenge that faces hydrologists in water resources planning is to predict the catchment's response to a given rainfall. Estimation of parameter uncertainty (and model uncertainty) allows assessment of the risk in likely applications of hydrological models. Bayesian statistical inference, with computations carried out via Markov Chain Monte Carlo (MCMC) methods, offers an attractive approach to model specification, allowing for the combination of any pre-existing knowledge about individual models and their respective parameters with the available catchment data to assess both parameter and model uncertainty. This thesis develops and applies Bayesian statistical tools for parameter estimation, comparison of model performance and hierarchical model aggregation. The work presented has three main sections. The first area of research compares four MCMC algorithms for simplicity, ease of use, efficiency and speed of implementation in the context of conceptual rainfall-runoff modelling. Included is an adaptive Metropolis algorithm that has characteristics that are well suited to hydrological applications. The utility of the proposed adaptive algorithm is further expanded by the second area of research in which a probabilistic regime for comparing selected models is developed and applied. The final area of research introduces a methodology for hydrologic model aggregation that is flexible and dynamic. Rigidity in the model structure limits representation of the variability in the flow generation mechanism, which becomes a limitation when the flow processes are not clearly understood. The proposed Hierarchical Mixtures of Experts (HME) model architecture is designed to do away with this limitation by selecting individual models probabilistically based on predefined catchment indicators. In addition, the approach allows a more flexible specification of the model error to better assess the risk of likely outcomes based on the model simulations.
Application of the approach to lumped and distributed rainfall runoff models for a variety of catchments shows that by assessing different catchment predictors the method can be a useful tool for prediction of catchment response.
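For readers unfamiliar with adaptive Metropolis sampling, the sketch below shows one common form of the idea: a random-walk Metropolis sampler whose Gaussian proposal covariance is re-estimated from the chain history and scaled by 2.4^2/d. It is not necessarily the variant compared in the thesis, and the target here is a toy two-dimensional standard normal rather than a rainfall-runoff posterior. The function name `adaptive_metropolis` and all settings are assumptions.

```python
import numpy as np

def adaptive_metropolis(log_post, x0, n_iter=5000, adapt_start=500,
                        eps=1e-6, seed=0):
    """Random-walk Metropolis with a proposal covariance learned from the chain.

    Hypothetical sketch; log_post is any function returning a log density
    up to an additive constant.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    d = x.size
    scale = 2.4 ** 2 / d            # commonly used scaling for Gaussian targets
    cov = 0.1 * np.eye(d)           # fixed proposal before adaptation begins
    chain = np.empty((n_iter, d))
    lp = log_post(x)
    for t in range(n_iter):
        if t >= adapt_start:
            # Empirical covariance of the history, regularised by eps * I.
            emp = np.cov(chain[:t].T).reshape(d, d)
            cov = scale * (emp + eps * np.eye(d))
        prop = rng.multivariate_normal(x, cov)
        lp_prop = log_post(prop)
        # Symmetric proposal, so the acceptance ratio is the posterior ratio.
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = prop, lp_prop
        chain[t] = x
    return chain

# Toy target (assumed): standard bivariate normal log density.
chain = adaptive_metropolis(lambda x: -0.5 * np.sum(x ** 2), x0=np.zeros(2))
```

Recomputing the covariance from scratch at every step keeps the sketch short; a practical implementation would update it recursively.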
|
85 |
Analysis of Bayesian anytime inference algorithms
Burgess, Scott Alan, 31 August 2001
This dissertation explores and analyzes the performance of several Bayesian
anytime inference algorithms for dynamic influence diagrams. These algorithms are
compared on the On-Line Maintenance Agent testbed, a software artifact permitting
comparison of dynamic reasoning algorithms used by an agent on a variety of simulated
maintenance and monitoring tasks. Analysis of their performance suggests that a
particular algorithmic property, which I term sampling kurtosis, may be responsible for
successful reasoning in the tested half-adder domain. A new algorithm is devised and
evaluated which permits testing of sampling kurtosis, revealing that it may not be the
most significant algorithm property but suggesting new lines of inquiry. Peculiarities in
the observed data lead to a detailed analysis of agent-simulator interaction, resulting in an
equation model and a Stochastic Automata Network model for a random action
algorithm. The model analyses are extended to show that some of the anytime reasoning
algorithms perform remarkably near optimally. The research suggests improvements for
the design and development of reasoning testbeds. / Graduation date: 2002
|
86 |
Modeling the NCAA Tournament Through Bayesian Logistic Regression
Nelson, Bryan, 18 July 2012
Many rating systems exist that order the Division I teams in Men's College Basketball that compete in the NCAA Tournament, such as seeding teams on an S-curve, and the Pomeroy and Sagarin ratings, simplifying the process of choosing winners to a comparison of two numbers. Rather than creating a rating system, we analyze each matchup by using the difference between the teams' individual regular season statistics as the independent variables. We use an MCMC approach and logistic regression along with several model selection techniques to arrive at models for predicting the winner of each game. When given the 63 actual games in the 2012 tournament, eight of our models performed as well as Pomeroy's rating system and four did as well as Sagarin's rating system. Not allowing the models to fix their mistakes resulted in only one model outperforming both Pomeroy's and Sagarin's systems. / McAnulty College and Graduate School of Liberal Arts / Computational Mathematics / MS / Thesis
|
87 |
PrOntoLearn: Unsupervised Lexico-Semantic Ontology Generation using Probabilistic Methods
Abeyruwan, Saminda Wishwajith, 01 January 2010
An ontology is a formal, explicit specification of a shared conceptualization. Formalizing an ontology for a domain is a tedious and cumbersome process. It is constrained by the knowledge acquisition bottleneck (KAB). There exist a large number of text corpora that can be used for classification in order to create ontologies that better support the intended parties. In our research we provide a novel unsupervised bottom-up ontology generation method. This method is based on lexico-semantic structures and Bayesian reasoning to expedite the ontology generation process. This process also provides evidence to domain experts to build ontologies based on top-down approaches.
|
88 |
Bayesian synthesis
Yu, Qingzhao, January 2006
Thesis (Ph. D.)--Ohio State University, 2006. / Title from first page of PDF file. Includes bibliographical references (p. 126-130).
|
89 |
Reconstructing posterior distributions of a species phylogeny using estimated gene tree distributions
Liu, Liang, January 2006
Thesis (Ph. D.)--Ohio State University, 2006. / Title from first page of PDF file. Includes bibliographical references (p. 94-103).
|
90 |
Bayesian Unsupervised Labeling of Web Document Clusters
Liu, Ting, 22 August 2011
Information technologies have recently led to a surge of electronic documents in the form of emails, webpages, blogs, news articles, etc. To help users decide which documents may be interesting to read, it is common practice to organize documents by categories/topics. A wide range of supervised and unsupervised learning techniques already exist for automated text classification and text clustering. However, supervised learning requires a training set of documents already labeled with topics/categories, which is not always readily available. In contrast, unsupervised learning techniques do not require labeled documents, but assigning a suitable category to each resulting cluster remains a difficult problem. The state of the art consists of extracting keywords based on word frequency (or related heuristics).
In this thesis, we improve the extraction of keywords for unsupervised labeling of document clusters by designing a Bayesian approach based on topic modeling. More precisely, we describe an approach that uses a large side corpus to infer a language model that implicitly encodes the semantic relatedness of different words. This language model is then used to build a generative model of the cluster in such a way that the probability of generating each word depends on its frequency in the cluster as well as the frequency of its semantically related words. The words with the highest probability of generation are then extracted to label the cluster.
In this approach, the side corpus can be thought of as a source of domain knowledge or context. However, there are two potential problems: processing a large side corpus can be time consuming, and if the content of this corpus is not similar enough to the cluster, the resulting language model may be biased. We deal with those issues by designing a Bayesian transfer learning framework that allows us to process the side corpus just once offline and to weigh its importance based on the degree of similarity with the cluster.
|