Global ETD Search

71	Bayesian model estimation and comparison for longitudinal categorical data Tran, Thu Trung January 2008 In this thesis, we address issues of model estimation for longitudinal categorical data and of model selection for these data with missing covariates. Longitudinal survey data capture the responses of each subject repeatedly through time, allowing for the separation of variation in the measured variable of interest across time for one subject from the variation in that variable among all subjects. Questions concerning persistence, patterns of structure, interaction of events and stability of multivariate relationships can be answered through longitudinal data analysis. Longitudinal data require special statistical methods because they must take into account the correlation between observations recorded on one subject. A further complication in analysing longitudinal data is accounting for the non-response or drop-out process. Potentially, the missing values are correlated with variables under study and hence cannot be totally excluded. Firstly, we investigate a Bayesian hierarchical model for the analysis of categorical longitudinal data from the Longitudinal Survey of Immigrants to Australia. Data for each subject is observed on three separate occasions, or waves, of the survey. One of the features of the data set is that observations for some variables are missing for at least one wave. A model for the employment status of immigrants is developed by introducing, at the first stage of a hierarchical model, a multinomial model for the response and then subsequent terms are introduced to explain wave and subject effects. To estimate the model, we use the Gibbs sampler, which allows missing data for both the response and explanatory variables to be imputed at each iteration of the algorithm, given some appropriate prior distributions. 
After accounting for significant covariate effects in the model, results show that the relative probability of remaining unemployed diminished with time following arrival in Australia. Secondly, we examine the Bayesian model selection techniques of the Bayes factor and Deviance Information Criterion for our regression models with missing covariates. Computing Bayes factors involves computing the often complex marginal likelihood p(y | model), and various authors have presented methods to estimate this quantity. Here, we take the approach of path sampling via power posteriors (Friel and Pettitt, 2006). The appeal of this method is that for hierarchical regression models with missing covariates, a common occurrence in longitudinal data analysis, it is straightforward to calculate and interpret since integration over all parameters, including the imputed missing covariates and the random effects, is carried out automatically with minimal added complexities of modelling or computation. We apply this technique to compare models for the employment status of immigrants to Australia. Finally, we also develop a model choice criterion based on the Deviance Information Criterion (DIC), similar to Celeux et al. (2006), but which is suitable for use with generalized linear models (GLMs) when covariates are missing at random. We define three different DICs: the marginal, where the missing data are averaged out of the likelihood; the complete, where the joint likelihood for response and covariates is considered; and the naive, where the likelihood is found assuming the missing values are parameters. These three versions have different computational complexities. We investigate through simulation the performance of these three different DICs for GLMs consisting of normally, binomially and multinomially distributed data with missing covariates having a normal distribution. 
We find that the marginal DIC and the estimate of the effective number of parameters, pD, have desirable properties appropriately indicating the true model for the response under differing amounts of missingness of the covariates. We find that the complete DIC is inappropriate generally in this context as it is extremely sensitive to the degree of missingness of the covariate model. Our new methodology is illustrated by analysing the results of a community survey.
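As a point of reference for the DIC discussion in entry 71, here is a minimal sketch of the standard DIC (mean deviance plus the effective number of parameters, pD). It is not the marginal, complete, or naive variant developed in the thesis, and it involves no missing covariates. It assumes a normal model with known variance and a flat prior on the mean, so the posterior can be sampled directly instead of by Gibbs sampling; all function names and the simulated data are hypothetical.

```python
import math
import random

def deviance(mu, data, sigma=1.0):
    """Deviance, i.e. -2 * log-likelihood, of data under Normal(mu, sigma^2)."""
    n = len(data)
    squared_error = sum((y - mu) ** 2 for y in data)
    return n * math.log(2 * math.pi * sigma ** 2) + squared_error / sigma ** 2

def dic(posterior_draws, data, sigma=1.0):
    """Return (DIC, pD) computed from posterior draws of the mean."""
    deviances = [deviance(m, data, sigma) for m in posterior_draws]
    mean_deviance = sum(deviances) / len(deviances)
    posterior_mean = sum(posterior_draws) / len(posterior_draws)
    # pD = mean deviance minus deviance at the posterior mean.
    p_d = mean_deviance - deviance(posterior_mean, data, sigma)
    return mean_deviance + p_d, p_d

# Simulated data (assumption): 50 observations from Normal(2, 1).
rng = random.Random(0)
data = [rng.gauss(2.0, 1.0) for _ in range(50)]

# With a flat prior and known sigma = 1, the posterior of the mean is
# Normal(sample mean, 1 / n), so it can be drawn from directly.
y_bar = sum(data) / len(data)
draws = [rng.gauss(y_bar, 1.0 / math.sqrt(len(data))) for _ in range(20000)]

dic_value, p_d = dic(draws, data)
print(f"DIC = {dic_value:.2f}, pD = {p_d:.2f}")
```

In this toy model there is one free parameter, so pD should come out close to 1; comparing DIC values across candidate models fitted to the same data is the intended use.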
72	Life-history decisions of larids in spatio-temporally varying habitats : where and when to breed / Décisions d'histoire de vie chez les laridés en habitats variables dans l'espace et le temps : où et quand se reproduire Acker, Paul 30 March 2017 Throughout their lifetime, individuals face two decisions which have major consequences on their reproductive success: where and when to breed. 
This thesis explores the mechanisms underlying these decisions through three studies based on individual monitoring data in the black-legged kittiwake (Rissa tridactyla) and the slender-billed gull (Chroicocephalus genei). The first study addresses dispersal in the kittiwake. The probability of leaving the breeding site is decomposed according to the hierarchical structure of habitat patches. A synthetic hypothesis that integrates the costs of dispersal and the use of information on habitat quality is proposed to explain the strategy of habitat selection. The second study uses an integrated population model in the kittiwake to estimate immigration, recruitment, and intermittent reproduction. This study investigates the relationship between social information on the habitat and the decision to breed in a population located at the edge of the species' range. The third study focuses on recruitment and dispersal in the slender-billed gull, which is characterized by a high degree of nomadic breeding. Multievent capture-recapture models are used to quantify sex- and age-dependent variations. These examples make it possible to address how important the constraints of habitat variability and intraspecific competition are in the process of obtaining a breeding position. Habitat selection Dispersal Life history Long term study Bayesian model
73	Income Inequality and Economic Growth: A Meta-Analysis Posvyanskaya, Alexandra January 2018 The impact of inequality on economic growth has become a topic of broad and current interest. Many studies have investigated the issue, but the disparity of opinions and empirical results is huge. The present thesis reviews the primary literature through a meta-analytical approach applying the Bayesian Model Averaging (BMA) estimation technique. We examine 562 estimates collected from 58 studies published between 1991 and 2015. We find evidence of publication bias in the literature. The authors of primary studies tend to preferentially report negative and significant estimates. The BMA results suggest that the effect of inequality on growth is not straightforward and is likely not linear. A single pattern for the inequality/growth relationship is not feasible since the results vary across the income inequality measures used, estimation methods, and data structure and quality. JEL Classification D31, O10, C11, C82 Keywords meta-analysis, inequality, economic growth, Bayesian model averaging, publication bias Author's e-mail 23376990@fsv.cuni.cz Supervisor's e-mail zuzana.havrankova@fsv.cuni.cz
74	Bankruptcy prediction models in the Czech economy: New specification using Bayesian model averaging and logistic regression on the latest data Kolísko, Jiří January 2017 The main objective of our research was to develop a new bankruptcy prediction model for the Czech economy. For that purpose we used logistic regression and 150,000 financial statements collected for the 2002-2016 period. We defined 41 explanatory variables (25 financial ratios and 16 dummy variables) and used Bayesian model averaging to select the best set of explanatory variables. The resulting model has been estimated for three prediction horizons: one, two, and three years before bankruptcy, so that we could assess the changes in the importance of explanatory variables and in the models' prediction accuracy. To deal with the high skew in our dataset due to the small number of bankrupt firms, we applied over- and under-sampling methods on the training sample (80% of data). These methods proved to enhance our classifier's accuracy for all specifications and periods. The accuracy of our models has been evaluated by receiver operating characteristic (ROC) curves, sensitivity-specificity curves, and precision-recall curves. In comparison with models examined on similar data, our model performed very well. In addition, we have selected the most powerful predictors for short- and long-term horizons, which is potentially of high relevance for practice. JEL Classification C11, C51, C53, G33, M21 Keywords Bankruptcy...
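The over-sampling step mentioned in entry 74 can be illustrated with a small sketch. This is plain random over-sampling with replacement, chosen here as an assumption; the abstract does not say which over- or under-sampling methods were used. The function name, the toy rows, and the labels are hypothetical.

```python
import random

def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Draw with replacement to fill the gap up to the majority-class size.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels

# Toy training sample (assumption): four healthy firms (0), one bankrupt firm (1).
rows = [[0.1], [0.2], [0.3], [0.4], [0.9]]
labels = [0, 0, 0, 0, 1]
balanced_rows, balanced_labels = random_oversample(rows, labels)
print(balanced_labels.count(0), balanced_labels.count(1))
```

As in the abstract, resampling of this kind would be applied to the training sample only, so that the held-out data keep the original class proportions.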
75	Tři eseje o finančním rozvoji / Three Essays on Financial Development Mareš, Jan January 2020 The dissertation is a compilation of three empirical papers on the effects of financial development. In the first paper, we examine finance's effect on long-term economic growth using Bayesian model averaging to address model uncertainty. Our global sample findings indicate that the efficiency of financial intermediation is robustly related to long-term growth. The second and third papers investigate the determinants of wealth and income inequality, capturing various economic, financial, political, institutional, and geographical factors. We reveal that finance plays a considerable role in shaping both distributions.
76	Combining Prior Information for the Prediction of Transcription Factor Binding Sites Benner, Philipp 21 June 2018 Despite the fact that each cell in an organism has the same genetic information, it is possible that cells fundamentally differ in their function. The molecular basis for the functional diversity of cells is governed by biochemical processes that regulate the expression of genes. Key to this regulatory process are proteins called transcription factors that recognize and bind specific DNA sequences of a few nucleotides. Here we tackle the problem of identifying the binding sites of a given transcription factor. The prediction of binding preferences from the structure of a transcription factor is still an unsolved problem. For that reason, binding sites are commonly identified by searching for overrepresented sites in a given collection of nucleotide sequences. Such sequences might be known regulatory regions of genes that are assumed to be coregulated, or they are obtained from so-called ChIP-seq experiments that identify approximately the sites that were bound by a given transcription factor. In both cases, the observed nucleotide sequences are much longer than the actual binding sites and computational tools are required to uncover the actual binding preferences of a factor. Because transcription factors recognize more than a single nucleotide sequence, the search for overrepresented patterns in a given collection of sequences has proven to be a challenging problem. Most computational methods have relied only on the given set of sequences, but additional information is required in order to make reliable predictions. Here, this information is obtained by looking at the evolution of nucleotide sequences. For that reason, each nucleotide sequence in the observed data is augmented by its orthologs, i.e. sequences from related species where the same transcription factor is present. 
By constructing multiple sequence alignments of the orthologous sequences it is possible to identify functional regions that are under selective pressure and therefore appear more conserved than others. The processing of the additional information provided by orthologous sequences relies on a phylogenetic tree equipped with a nucleotide substitution model that carries information not only about the ancestry, but also about the expected similarity of functional sites. As a result, a Bayesian method for the identification of transcription factor binding sites is presented. The method relies on a phylogenetic tree that agrees with the assumptions of the nucleotide substitution process. Therefore, the problem of estimating phylogenetic trees is discussed first. The computation of point estimates relies on recent developments in Hadamard spaces. Second, the statistical model is presented that captures the enrichment and conservation of binding sites and other functional regions in the observed data. The performance of the method is evaluated on ChIP-seq data of transcription factors, where the binding preferences have been estimated in previous studies.
77	Ekonomická nerovnost a percepce štěstí: Meta-analýza / Income Inequality and Happiness: A Meta-Analysis Kamenická, Lucie January 2021 The relationship between income inequality and happiness is central to a host of welfare policies. If higher income inequality makes people less happy, advocating for income redistribution from the rich to the poor could make society happier. We show, however, that this popular consensus on the relationship's direction is rather absent in the academic literature. Based on 868 observations collected from 53 studies and controlling for 62 aspects of study design, we use state-of-the-art meta-analysis techniques to identify several important drivers of the effect. Unless each study gets the same weight, the literature is driven by publication bias pushing the estimates against the popular consensus. While geographical differences dominate among the systematic influences on the relationship's magnitude, the relationship is also strongly affected by the various methods and data the authors use in the primary studies. Most prominently, it matters whether authors control for different individual characteristics, such as perceived trust in people or health status.
78	Probabilistic Analysis of Contracting Ebola Virus Using Contextual Intelligence Gopalakrishnan, Arjun 05 1900 The outbreak of the Ebola virus was declared a Public Health Emergency of International Concern by the World Health Organisation (WHO). Due to the complex nature of the outbreak, the Centers for Disease Control and Prevention (CDC) created interim guidance for monitoring people potentially exposed to Ebola, for evaluating their intended travel, and for restricting the movements of carriers when needed. Tools to evaluate the risk of individuals and groups of individuals contracting the disease could mitigate the growing anxiety and fear. The goal is to understand and analyze the nature of the risk an individual would face when he/she comes in contact with a carrier. This thesis presents a tool that makes use of contextual data intelligence to predict the risk factor of individuals who come in contact with the carrier. Ebola Idid App Contextual Intelligence Risk Bayesian model Ebola virus disease -- Risk factors. Ebola virus disease -- Transmission.
79	Bayesian Model Selections for Log-binomial Regression Zhou, Wei January 2018 No description available. Statistics Log-binomial Regression Bayesian Model Selection Bayesian Variable Selection Monte Carlo methods Bayes factor Relative Risk
80	Skill Evaluation in Women's Volleyball Florence, Lindsay Walker 11 March 2008 The Brigham Young University Women's Volleyball Team recorded and rated all skills (pass, set, attack, etc.) and recorded rally outcomes (point for BYU, rally continues, point for opponent) for the entire 2006 home volleyball season. Only sequences of events occurring on BYU's side of the net were considered. Events followed one of these general patterns: serve-outcome, pass-set-attack-outcome, or block-dig-set-attack-outcome. These sequences of events were assumed to be first-order Markov chains, where the quality of each contact depended explicitly only on the quality of the previous contact and not on contacts further removed in the sequence. We represented these sequences in an extensive matrix of transition probabilities where the elements of the matrix were the probabilities of moving from one state to another. The count matrix consisted of the number of times play moved from one transition state to another during the season. Data in the count matrix were assumed to have a multinomial distribution. A Dirichlet prior was formulated for each row of the count matrix, so posterior estimates of the transition probabilities were then available using Gibbs sampling. The different paths in the transition probability matrix were followed through the possible sequences of events at each step of the MCMC process to compute the posterior probability density that a perfect pass results in a point, a perfect set results in a point, and so forth. These posterior probability densities are used to address questions about skill performance in BYU women's volleyball. volleyball Markov chain transition matrix Markov chain Monte Carlo Gibbs sampling multinomial distribution Bayesian model Statistics and Probability
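The Dirichlet-multinomial setup described in entry 80 can be sketched for a single row of the count matrix. This is an illustration under stated assumptions, not the thesis's procedure: it draws directly from the conjugate Dirichlet posterior of one row rather than running Gibbs sampling, it looks at one transition step rather than following full sequences, and the counts, state names, and symmetric prior are hypothetical.

```python
import random

def posterior_mean_row(counts, prior=1.0):
    """Posterior mean of one transition row under a symmetric Dirichlet(prior) prior."""
    total = sum(counts) + prior * len(counts)
    return [(c + prior) / total for c in counts]

def sample_row(counts, rng, prior=1.0):
    """One draw from Dirichlet(counts + prior), built from normalised gamma variates."""
    gammas = [rng.gammavariate(c + prior, 1.0) for c in counts]
    total = sum(gammas)
    return [g / total for g in gammas]

# Hypothetical counts of what followed a perfect pass:
# [point for the team, rally continues, point for the opponent].
perfect_pass_counts = [30, 15, 5]

rng = random.Random(1)
# Posterior draws of the probability that a perfect pass leads directly to a point.
point_draws = [sample_row(perfect_pass_counts, rng)[0] for _ in range(5000)]
point_mean = sum(point_draws) / len(point_draws)

print(posterior_mean_row(perfect_pass_counts))
print(round(point_mean, 3))
```

Repeating the draw for every row gives one sampled transition matrix; multiplying probabilities along a path such as pass-set-attack-outcome within each draw would then give a posterior sample for the chance that a given contact eventually results in a point.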
