  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Bayesian Model Selection for High-dimensional High-throughput Data

Joshi, Adarsh 2010 May 1900 (has links)
Bayesian methods are often criticized on the grounds of subjectivity. Furthermore, misspecified priors can have a deleterious effect on Bayesian inference. Noting that model selection is effectively a test of many hypotheses, Dr. Valen E. Johnson sought to eliminate the need for prior specification by computing Bayes' factors from frequentist test statistics. In his pioneering work published in 2005, Dr. Johnson proposed using so-called local priors for computing Bayes' factors from test statistics. Dr. Johnson and Dr. Jianhua Hu used Bayes' factors for model selection in a linear model setting. In an independent work, Dr. Johnson and another colleague, David Rossell, investigated two families of non-local priors for testing the regression parameter in a linear model setting. These non-local priors enable greater separation between the theories of null and alternative hypotheses. In this dissertation, I extend model selection based on Bayes' factors and use non-local priors to define Bayes' factors based on test statistics. With these priors, I have been able to reduce the problem of prior specification to setting just one scaling parameter. That scaling parameter can be easily set, for example, on the basis of frequentist operating characteristics of the corresponding Bayes' factors. Furthermore, the loss of information from basing a Bayes' factor on a test statistic is minimal. Along with Dr. Johnson and Dr. Hu, I used the Bayes' factors based on the likelihood ratio statistic to develop a method for clustering gene expression data. This method has performed well in both simulated examples and real datasets. An outline of that work is also included in this dissertation. Further, I extend the clustering model to a subclass of the decomposable graphical model class, which is more appropriate for genotype data sets, such as single-nucleotide polymorphism (SNP) data. 
Efficient FORTRAN programming has enabled me to apply the methodology to hundreds of nodes. For problems that produce computationally harder probability landscapes, I propose a modification of the Markov chain Monte Carlo algorithm to extract information regarding the important network structures in the data. This modified algorithm performs well in inferring complex network structures. I use this method to develop a prediction model for disease based on SNP data. My method performs well in cross-validation studies.
2

Bayesian Model Discrimination and Bayes Factors for Normal Linear State Space Models

Frühwirth-Schnatter, Sylvia January 1993 (has links) (PDF)
It is suggested to discriminate between different state space models for a given time series by means of a Bayesian approach which chooses the model that minimizes the expected loss. Practical implementation of this procedure requires a fully Bayesian analysis for both the state vector and the unknown hyperparameters, which is carried out by Markov chain Monte Carlo methods. Application to some non-standard situations, such as testing hypotheses on the boundary of the parameter space, discriminating between non-nested models, and discriminating among more than two models, is discussed in detail. (author's abstract) / Series: Forschungsberichte / Institut für Statistik
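The decision rule described in this abstract, choosing the model that minimizes the posterior expected loss, can be sketched in a few lines. This is only an illustration under stated assumptions, not the report's implementation: the function names, the loss matrix layout `loss[a][m]` (cost of choosing model `a` when model `m` is true), and the assumption that log marginal likelihoods are already available (in the report they would come from the MCMC analysis) are all hypothetical.

```python
import math


def posterior_model_probs(log_marginals, prior_probs):
    """Posterior model probabilities from log marginal likelihoods.

    Uses a max-shift (log-sum-exp style) so that very negative
    log marginal likelihoods do not underflow to zero.
    """
    log_w = [lm + math.log(p) for lm, p in zip(log_marginals, prior_probs)]
    shift = max(log_w)
    w = [math.exp(v - shift) for v in log_w]
    total = sum(w)
    return [v / total for v in w]


def choose_model(log_marginals, prior_probs, loss):
    """Return (index of chosen model, posterior probs, expected losses).

    loss[a][m] is the (hypothetical) cost of choosing model a
    when model m is the true one.
    """
    post = posterior_model_probs(log_marginals, prior_probs)
    expected = [
        sum(loss[a][m] * post[m] for m in range(len(post)))
        for a in range(len(loss))
    ]
    best = min(range(len(expected)), key=expected.__getitem__)
    return best, post, expected
```

Under a 0-1 loss this rule reduces to picking the model with the highest posterior probability; an asymmetric loss matrix can shift the choice toward a less probable model when wrongly rejecting it is costly.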
3

Calibrated Bayes factors for model selection and model averaging

Lu, Pingbo 24 August 2012 (has links)
No description available.
4

Dynamic network data envelopment analysis with a sequential structure and behavioural-causal analysis: Application to the Chinese banking industry

Fukuyama, H., Tsionas, M., Tan, Yong 24 March 2023 (has links)
The current study contributes to the literature in efficiency analysis in two ways: 1) we build on the existing studies in Dynamic Network Data Envelopment Analysis (DNDEA) by proposing a sequential structure incorporating dual-role characteristics of the production factors; 2) we initiate the efforts to complement our innovative sequential DNDEA with a behavioural-causal analysis. This statistical analysis is important because it not only validates the proposed efficiency analysis but can also be generalized to future studies that design innovative production processes. Finally, we apply these two different analyses to the banking industry. Using a sample of 43 Chinese commercial banks including five different ownership types (state-owned, joint-stock, city, rural, and foreign banks) between 2010 and 2018, we find that the inefficiency level is around 0.14, although slight volatility has been observed. We find that state-owned banks have the highest efficiency, and although foreign banks are less efficient than joint-stock banks, they are more efficient than city banks. Finally, we find that rural banks have the highest inefficiency.
5

Bayesian Phylogenetics and the Evolution of Gall Wasps

Nylander, Johan A. A. January 2004 (has links)
This thesis concerns the phylogenetic relationships and the evolution of the gall-inducing wasps belonging to the family Cynipidae. Several previous studies have used morphological data to reconstruct the evolution of the family. DNA sequences from several mitochondrial and nuclear genes were obtained and the first molecular, and combined molecular and morphological, analyses of higher-level relationships in the Cynipidae are presented. A Bayesian approach to data analysis is adopted, and models allowing combined analysis of heterogeneous data, such as multiple DNA data sets and morphology, are developed. The performance of these models is evaluated using methods that allow the estimation of posterior model probabilities, thus allowing selection of the most probable models for use in phylogenetics. The use of Bayesian model averaging in phylogenetics, as opposed to model selection, is also discussed. It is shown that Bayesian MCMC analysis deals efficiently with complex models and that morphology can influence combined-data analyses, despite being outnumbered by DNA data. This emphasizes the utility and potential importance of using morphological data in statistical analyses of phylogeny. The DNA-based and combined-data analyses of cynipid relationships differ from previous studies in two important respects. First, it was previously believed that there was a monophyletic clade of woody rosid gallers but the new results place the non-oak gallers in this assemblage (tribes Pediaspidini, Diplolepidini, and Eschatocerini) outside the rest of the Cynipidae. Second, earlier studies have lent strong support to the monophyly of the inquilines (tribe Synergini), gall wasps that develop inside the galls of other species. The new analyses suggest that the inquilines either originated several times independently, or that some inquilines secondarily regained the ability to induce galls. 
Possible reasons for the incongruence between morphological and DNA data are discussed in terms of heterogeneity in evolutionary rates among lineages, and convergent evolution of morphological characters.
6

Recursive Residuals and Model Diagnostics for Normal and Non-Normal State Space Models

Frühwirth-Schnatter, Sylvia January 1994 (has links) (PDF)
Model diagnostics for normal and non-normal state space models is based on recursive residuals which are defined from the one-step ahead predictive distribution. Routine calculation of these residuals is discussed in detail. Various tools of diagnostics are suggested to check e.g. for wrong observation distributions and for autocorrelation. The paper also covers such topics as model diagnostics for discrete time series, model diagnostics for generalized linear models, and model discrimination via Bayes factors. (author's abstract) / Series: Forschungsberichte / Institut für Statistik
7

Bayesian model estimation and comparison for longitudinal categorical data

Tran, Thu Trung January 2008 (has links)
In this thesis, we address issues of model estimation for longitudinal categorical data and of model selection for these data with missing covariates. Longitudinal survey data capture the responses of each subject repeatedly through time, allowing for the separation of variation in the measured variable of interest across time for one subject from the variation in that variable among all subjects. Questions concerning persistence, patterns of structure, interaction of events and stability of multivariate relationships can be answered through longitudinal data analysis. Longitudinal data require special statistical methods because they must take into account the correlation between observations recorded on one subject. A further complication in analysing longitudinal data is accounting for the non-response or drop-out process. Potentially, the missing values are correlated with variables under study and hence cannot be totally excluded. Firstly, we investigate a Bayesian hierarchical model for the analysis of categorical longitudinal data from the Longitudinal Survey of Immigrants to Australia. Data for each subject are observed on three separate occasions, or waves, of the survey. One of the features of the data set is that observations for some variables are missing for at least one wave. A model for the employment status of immigrants is developed by introducing, at the first stage of a hierarchical model, a multinomial model for the response and then subsequent terms are introduced to explain wave and subject effects. To estimate the model, we use the Gibbs sampler, which allows missing data for both the response and explanatory variables to be imputed at each iteration of the algorithm, given some appropriate prior distributions. After accounting for significant covariate effects in the model, results show that the relative probability of remaining unemployed diminished with time following arrival in Australia. 
Secondly, we examine the Bayesian model selection techniques of the Bayes factor and Deviance Information Criterion for our regression models with missing covariates. Computing Bayes factors involves computing the often complex marginal likelihood p(y|model) and various authors have presented methods to estimate this quantity. Here, we take the approach of path sampling via power posteriors (Friel and Pettitt, 2006). The appeal of this method is that for hierarchical regression models with missing covariates, a common occurrence in longitudinal data analysis, it is straightforward to calculate and interpret since integration over all parameters, including the imputed missing covariates and the random effects, is carried out automatically with minimal added complexities of modelling or computation. We apply this technique to compare models for the employment status of immigrants to Australia. Finally, we also develop a model choice criterion based on the Deviance Information Criterion (DIC), similar to Celeux et al. (2006), but which is suitable for use with generalized linear models (GLMs) when covariates are missing at random. We define three different DICs: the marginal, where the missing data are averaged out of the likelihood; the complete, where the joint likelihood for response and covariates is considered; and the naive, where the likelihood is found assuming the missing values are parameters. These three versions have different computational complexities. We investigate through simulation the performance of these three different DICs for GLMs consisting of normally, binomially and multinomially distributed data with missing covariates having a normal distribution. We find that the marginal DIC and the estimate of the effective number of parameters, pD, have desirable properties appropriately indicating the true model for the response under differing amounts of missingness of the covariates. 
We find that the complete DIC is inappropriate generally in this context as it is extremely sensitive to the degree of missingness of the covariate model. Our new methodology is illustrated by analysing the results of a community survey.
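The path-sampling idea mentioned in this abstract rests on the identity log p(y) = ∫₀¹ E[log p(y|θ)] dt, where the expectation at temperature t is taken under the power posterior p(θ|y,t) ∝ p(y|θ)^t p(θ). The sketch below illustrates that identity on a toy problem and is not the thesis's model: it assumes a normal mean with known variance and a conjugate normal prior, so each power posterior can be sampled exactly (the thesis would use MCMC instead). The function names, the temperature schedule t_i = (i/N)^5, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def exact_log_evidence(y, m0, s0, sigma):
    """Closed-form log marginal likelihood for y_i ~ N(theta, sigma^2),
    theta ~ N(m0, s0^2). Used only as a reference value."""
    n = len(y)
    ybar = sum(y) / n
    ss = sum((yi - ybar) ** 2 for yi in y)
    d = ybar - m0
    quad = (ss + n * d * d * sigma ** 2 / (sigma ** 2 + n * s0 ** 2)) / sigma ** 2
    return (-0.5 * n * math.log(2 * math.pi * sigma ** 2)
            - 0.5 * math.log(1 + n * s0 ** 2 / sigma ** 2)
            - 0.5 * quad)


def power_posterior_log_evidence(y, m0, s0, sigma,
                                 n_temps=40, n_draws=5000, power=5, seed=0):
    """Estimate log p(y) by integrating E_t[log p(y|theta)] over t in [0, 1]."""
    rng = random.Random(seed)
    n = len(y)
    ybar = sum(y) / n
    ss = sum((yi - ybar) ** 2 for yi in y)
    const = -0.5 * n * math.log(2 * math.pi * sigma ** 2)

    # Temperatures concentrated near 0, where the integrand changes fastest
    # (assumed schedule).
    temps = [(i / n_temps) ** power for i in range(n_temps + 1)]

    means = []
    for t in temps:
        # The power posterior is normal here because the model is conjugate.
        prec = 1 / s0 ** 2 + t * n / sigma ** 2
        mu = (m0 / s0 ** 2 + t * n * ybar / sigma ** 2) / prec
        sd = math.sqrt(1 / prec)
        total = 0.0
        for _ in range(n_draws):
            theta = rng.gauss(mu, sd)
            # Log-likelihood written via sufficient statistics.
            total += const - (ss + n * (ybar - theta) ** 2) / (2 * sigma ** 2)
        means.append(total / n_draws)

    # Trapezoidal rule over the temperature grid.
    return sum(0.5 * (temps[i + 1] - temps[i]) * (means[i + 1] + means[i])
               for i in range(n_temps))
```

Because the toy model has a closed-form evidence, the estimate can be checked against `exact_log_evidence`; in a hierarchical model with missing covariates no such check exists, and the accuracy depends on the temperature grid and on how well the MCMC runs mix at each temperature.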
8

Bayesian Methods in Gaussian Graphical Models

Mitsakakis, Nikolaos 31 August 2010 (has links)
This thesis contributes to the field of Gaussian Graphical Models by exploring either numerically or theoretically various topics of Bayesian Methods in Gaussian Graphical Models and by providing a number of interesting results, the further exploration of which would be promising, pointing to numerous future research directions. Gaussian Graphical Models are statistical methods for the investigation and representation of interdependencies between components of continuous random vectors. This thesis aims to investigate some issues related to the application of Bayesian methods for Gaussian Graphical Models. We adopt the popular $G$-Wishart conjugate prior $W_G(\delta,D)$ for the precision matrix. We propose an efficient sampling method for the $G$-Wishart distribution based on the Metropolis Hastings algorithm and show its validity through a number of numerical experiments. We show that this method can be easily used to estimate the Deviance Information Criterion, providing a computationally inexpensive approach for model selection. In addition, we look at the marginal likelihood of a graphical model given a set of data. This is proportional to the ratio of the posterior over the prior normalizing constant. We explore methods for the estimation of this ratio, focusing primarily on applying the Monte Carlo simulation method of path sampling. We also explore numerically the effect of the completion of the incomplete matrix $D^{\mathcal{V}}$, hyperparameter of the $G$-Wishart distribution, for the estimation of the normalizing constant. We also derive a series of exact and approximate expressions for the Bayes Factor between two graphs that differ by one edge. A new theoretical result regarding the limit of the normalizing constant multiplied by the hyperparameter $\delta$ is given and its implications to the validity of an improper prior and of the subsequent Bayes Factor are discussed.
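For readers unfamiliar with the Deviance Information Criterion mentioned in this abstract, the sketch below shows its usual form, DIC = D̄ + pD with pD = D̄ − D(θ̄), computed from posterior draws. It is a generic illustration, not the thesis's G-Wishart procedure: the function name and its inputs (a list of deviance values, one per posterior draw, plus the deviance evaluated at the posterior mean of the parameters) are hypothetical.

```python
def dic(deviance_draws, deviance_at_posterior_mean):
    """Return (DIC, pD) from posterior deviance values.

    deviance_draws: deviance -2 * log p(y | theta) at each posterior draw.
    deviance_at_posterior_mean: deviance at the posterior mean of theta.
    """
    d_bar = sum(deviance_draws) / len(deviance_draws)  # posterior mean deviance
    p_d = d_bar - deviance_at_posterior_mean           # effective number of parameters
    return d_bar + p_d, p_d
```

For example, `dic([10.0, 12.0, 14.0], 11.0)` gives a mean deviance of 12, pD = 1 and DIC = 13. Lower DIC values indicate a better trade-off between fit and complexity, so candidate models are compared by evaluating this quantity for each from its own posterior sample.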
10

Phylogenetic Relationships of Silene sect. Melandrium and Allied Taxa (Caryophyllaceae), as Deduced from Multiple Gene Trees

Rautenberg, Anja January 2009 (has links)
This thesis focuses on phylogenetic relationships among some of the major lineages in Silene subgenus Behenantha (Caryophyllaceae) using DNA sequences from multiple, potentially unlinked gene regions from a large taxonomic and geographic sample. Both traditional phylogenetic analyses and a strategy to infer species trees and gene trees in a joint approach are used. A new strategy to optimize species classifications, based on the likelihoods of the observed gene trees, is presented. Silene latifolia, S. dioica and the other dioecious species previously classified in section Elisanthe are not closely related to the type of the section (S. noctiflora). The correct name for the group of dioecious species is section Melandrium. The chloroplast DNA data presented indicate a geographic, rather than a taxonomic, structure in section Melandrium. The nuclear genes investigated correlate more to the current taxonomy, although hybridization has likely been influencing the relationships within section Melandrium. Incongruence between different parts of the gene SlXY1 in two Silene lineages is investigated, using phylogenetic methods and a novel probabilistic, multiple primer-pair PCR approach. The incongruence is best explained by ancient hybridization and recombination events. A survey of mitochondrial substitution rate variation in Sileneae is presented. Silene section Conoimorpha, S. noctiflora and the closely related S. turkestanica have elevated synonymous substitution rates in the mitochondrial genes investigated. Morphological and phylogenetic data reject that the Californian S. multinervia should be treated as a synonym to the Asian S. coniflora, as has previously been suggested. Furthermore, none of the genes investigated, or a chromosome count, support the inclusion of S. multinervia in section Conoimorpha. Data from multiple genes suggest that S. noctiflora and S. turkestanica form a sister group to section Conoimorpha. 
The calyx nervature, which is a potential synapomorphy for S. multinervia and section Conoimorpha, may be explained either by parallelism or by sorting effects.
