21 |
A Bayesian Approach to Detect the Onset of Activity Limitation Among Adults in NHIS. Bai, Yan, 06 May 2005 (has links)
Data from the 1995 National Health Interview Survey (NHIS) indicate that, due to chronic conditions, the onset of activity limitation typically occurs between ages 40 and 70 (i.e., the proportion of young adults with activity limitation is small and roughly constant with age, and then it starts to change, roughly increasing). We use a Bayesian hierarchical model to detect the change point of a positive activity limitation status (ALS) across twelve domains based on race, gender, and education. We have two types of data, weighted and unweighted; we obtain weighted binomial counts using a regression analysis with the sample weights. Given the proportion of individuals in the population with positive ALS, we assume that the number of individuals with positive ALS in each age group has a binomial probability mass function. The proportions across age are different: they share one beta distribution up to the (unknown) change point, and the proportions after the change point have a different beta distribution. We consider two different analyses. The first considers each domain individually in its own model, and the second considers the twelve domains simultaneously in a single model to "borrow strength" as in small area estimation. It is reasonable to assume that each domain has its own onset. In the first analysis, we use the Gibbs sampler to fit the model, and a computation of the marginal likelihoods, using an output analysis from the Gibbs sampler, provides the posterior distribution of the change point. We note that a reversible jump sampler fails in this analysis because it tends to get stuck at either age 40 or age 70. In the second analysis, we use the Gibbs sampler to fit only the joint posterior distribution of the twelve change points. This is a difficult problem because the joint density requires the numerical computation of a triple integral at each iteration. The other parameters of the process are obtained using data augmentation by a Metropolis sampler and a Rao-Blackwellization. We found that overall the age of onset is about 50 to 60 years.
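As an illustration of the change-point idea described above (not the dissertation's code), the sketch below computes the posterior of the change point for binomial ALS counts when the pre- and post-change beta hyperparameters are held fixed, so the proportions integrate out analytically into beta-binomial terms; the hyperparameter values and toy counts are assumptions for the example.

```python
# Minimal sketch of a beta-binomial change-point posterior: y_t positives out
# of n_t per age group, one Beta prior on the proportions before the unknown
# change point k and a different Beta after it, uniform prior on k.
import numpy as np
from scipy.special import betaln, gammaln

def log_betabinom(y, n, a, b):
    return (gammaln(n + 1) - gammaln(y + 1) - gammaln(n - y + 1)
            + betaln(a + y, b + n - y) - betaln(a, b))

def changepoint_posterior(y, n, a1, b1, a2, b2):
    T = len(y)
    log_post = np.empty(T - 1)
    for k in range(1, T):                 # first k age groups share the pre-change Beta
        before = log_betabinom(y[:k], n[:k], a1, b1).sum()
        after = log_betabinom(y[k:], n[k:], a2, b2).sum()
        log_post[k - 1] = before + after  # uniform prior on the change point
    log_post -= log_post.max()
    post = np.exp(log_post)
    return post / post.sum()

# toy counts: the proportion with positive ALS jumps upward around the 6th age group
y = np.array([3, 4, 3, 5, 4, 12, 15, 18, 22, 25])
n = np.full(10, 100)
print(changepoint_posterior(y, n, a1=1.0, b1=9.0, a2=3.0, b2=7.0))
```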
|
22 |
Bayesian Semiparametric Models For Nonignorable Missing Data Mechanisms In Logistic Regression. Ozturk, Olcay, 01 May 2011 (has links) (PDF)
In this thesis, Bayesian semiparametric models for the missing data mechanisms of nonignorably missing covariates in logistic regression are developed. In the missing data literature, a fully parametric approach is used to model nonignorable missing data mechanisms: a probit or logit link of the conditional probability of the covariate being missing is modeled as a linear combination of all variables, including the missing covariate itself. However, nonignorably missing covariates may not be linearly related to the probit (or logit) of this conditional probability. In our study, the relationship between the probit of the probability of the covariate being missing and the missing covariate itself is modeled with a semiparametric approach based on penalized spline regression. An efficient Markov chain Monte Carlo (MCMC) sampling algorithm to estimate the parameters is established, and WinBUGS code is constructed to sample from the full conditional posterior distributions of the parameters using Gibbs sampling. Monte Carlo simulation experiments under different true missing data mechanisms are conducted to compare the bias and efficiency properties of the resulting estimators with those from the fully parametric approach. These simulations show that estimators for logistic regression using semiparametric missing data models have better bias and efficiency properties than those using fully parametric missing data models when the true relationship between the missingness and the missing covariate is nonlinear; the two approaches are comparable when this relationship is linear.
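A minimal sketch of the kind of missingness model described above, assuming a truncated-line spline basis with placeholder knots and coefficients (the thesis fits its model in WinBUGS; nothing here is its code):

```python
# Semiparametric missingness sketch: the probit of P(R = 1 | x, z) is linear in
# the fully observed covariates z but a penalized spline in the possibly
# missing covariate x. In the Bayesian fit, the spline coefficients would get a
# shrinkage (normal) prior; here we only show the mean structure.
import numpy as np
from scipy.stats import norm

def spline_basis(x, knots):
    """Truncated-line basis: [x, (x - k1)_+, ..., (x - kK)_+]."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([x] + [np.maximum(x - k, 0.0) for k in knots])

def prob_missing(x, z, alpha0, alpha_z, beta_spline, knots):
    """P(covariate missing) under a probit link with a spline in x."""
    eta = alpha0 + z @ alpha_z + spline_basis(x, knots) @ beta_spline
    return norm.cdf(eta)

# illustrative call with made-up values
rng = np.random.default_rng(1)
x = rng.normal(size=5)
z = rng.normal(size=(5, 2))
knots = np.array([-1.0, 0.0, 1.0])
print(prob_missing(x, z, alpha0=-0.5, alpha_z=np.array([0.2, -0.3]),
                   beta_spline=np.array([0.4, -0.6, 0.8, -0.2]), knots=knots))
```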
|
23 |
Gibbs Sampling and Expectation Maximization Methods for Estimation of Censored Values from Correlated Multivariate Distributions. HUNTER, TINA D., 25 August 2008 (has links)
No description available.
|
24 |
Bayesian Cluster Analysis: Some Extensions to Non-standard Situations. Franzén, Jessica, January 2008 (has links)
The Bayesian approach to cluster analysis is presented. We assume that all data stem from a finite mixture model, where each component corresponds to one cluster and is given by a multivariate normal distribution with unknown mean and variance. The method produces posterior distributions of all cluster parameters and proportions, as well as associated cluster probabilities for all objects. We extend this method in several directions to some common but non-standard situations. The first extension covers the case with a few deviant observations not belonging to one of the normal clusters. An extra component/cluster is created for them, which has a larger variance or a different distribution, e.g. is uniform over the whole range. The second extension is clustering of longitudinal data. All units are clustered at all time points separately, and the movements between time points are modeled by Markov transition matrices. This means that the clustering at one time point will be affected by what happens at the neighbouring time points. The third extension handles datasets with missing data, e.g. item non-response. We impute the missing values iteratively in an extra step of the Gibbs sampler estimation algorithm. Bayesian inference of mixture models has many advantages over the classical approach, but it is not without computational difficulties. A software package, written in Matlab, for Bayesian inference of mixture models is introduced. The programs of the package handle the basic cases of clustering data that are assumed to arise from mixture models of multivariate normal distributions, as well as the non-standard situations.
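To make the Gibbs sampling step concrete, here is a deliberately simplified sketch of one sweep for a mixture with an extra wide component for deviant observations. It is univariate with known variances and illustrative priors, not the multivariate Matlab package described above.

```python
# One Gibbs sweep for mixture-model clustering with a fixed wide "deviant"
# component: sample labels, update cluster means (conjugate normal), update
# mixture weights (Dirichlet).
import numpy as np
rng = np.random.default_rng(0)

def gibbs_sweep(y, means, sigmas, weights, prior_mean=0.0, prior_var=100.0):
    K = len(means)
    # 1) sample cluster labels from their posterior probabilities
    log_p = (np.log(weights)
             - 0.5 * ((y[:, None] - means) / sigmas) ** 2
             - np.log(sigmas))
    p = np.exp(log_p - log_p.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    labels = np.array([rng.choice(K, p=row) for row in p])
    # 2) update cluster means; the wide noise component (index K-1) is kept
    #    fixed so it can absorb deviant observations
    for k in range(K - 1):
        yk = y[labels == k]
        prec = 1.0 / prior_var + len(yk) / sigmas[k] ** 2
        mean = (prior_mean / prior_var + yk.sum() / sigmas[k] ** 2) / prec
        means[k] = rng.normal(mean, np.sqrt(1.0 / prec))
    # 3) update mixture weights from their Dirichlet full conditional
    counts = np.bincount(labels, minlength=K)
    weights = rng.dirichlet(1.0 + counts)
    return labels, means, weights

# two normal clusters plus a couple of deviant points
y = np.concatenate([rng.normal(-3, 1, 100), rng.normal(3, 1, 100), [15.0, -20.0]])
means = np.array([-1.0, 1.0, 0.0])
sigmas = np.array([1.0, 1.0, 20.0])        # last component is the wide one
weights = np.ones(3) / 3
for _ in range(200):
    labels, means, weights = gibbs_sweep(y, means, sigmas, weights)
```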
|
25 |
Optimisation spatio-temporelle d’efforts de recherche pour cibles manoeuvrantes et intelligentes / Spatio-temporal optimisation of search efforts for smart and reactive moving targets. Chouchane, Mathieu, 17 October 2013 (has links)
In this work, we address a problem posed by the DGA Techniques navales for the surveillance of a strategic area: planning the optimal spatial and temporal deployment of a set of sensors so as to maximise the probability of detecting a mobile and smart target. The target is said to be smart because, under certain conditions, it can detect the threat posed by the sensors and react by adapting its behaviour to avoid them. Since the generated deployments can also be costly, this criterion must be taken into account when solving the problem. Solving a problem of this type requires, depending on the needs, a mono-objective or even multi-objective optimisation method. Until now, existing work has not addressed the cost of the proposed deployments; moreover, most of it concentrates on only one aspect at a time, and for algorithmic reasons the constraints are generally discretised. In the first part, we present an algorithm that computes the most efficient spatio-temporal deployment of sensors without taking its cost into account. This method is an application of the generalised splitting method to optimisation. In the second part, we first show that using a weighted sum of the two criteria yields solutions without increasing the computation time. For our second approach, we draw on evolutionary multiobjective optimisation algorithms and adapt the generalised splitting method to multiobjective optimisation. Finally, we compare our results with those of the NSGA-II algorithm.
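As a generic illustration of the weighted-sum step mentioned above (not the thesis's method: the detection and cost functions, the crude random search standing in for the splitting method, and all parameter values are placeholders):

```python
# Weighted-sum scalarisation of two deployment criteria: each candidate sensor
# deployment is scored by a convex combination of detection probability
# (maximised) and cost (minimised); sweeping the weight traces an
# approximation of the Pareto front.
import numpy as np
rng = np.random.default_rng(42)

def detection_prob(plan):       # placeholder: probability the target is detected
    return 1.0 - np.exp(-plan.sum())

def cost(plan):                 # placeholder: total deployment cost
    return float(plan @ np.arange(1, plan.size + 1))

def weighted_sum_search(weights, n_candidates=500, n_sensors=6):
    front = []
    for w in weights:
        best, best_score = None, -np.inf
        for _ in range(n_candidates):        # random search stands in for the
            plan = rng.random(n_sensors)     # generalised splitting method
            score = w * detection_prob(plan) - (1 - w) * cost(plan)
            if score > best_score:
                best, best_score = plan, score
        front.append((detection_prob(best), cost(best)))
    return front

print(weighted_sum_search(weights=np.linspace(0.1, 0.9, 5)))
```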
|
26 |
Modelling Long-Term Persistence in Hydrological Time Series. Thyer, Mark Andrew, January 2001 (has links)
The hidden state Markov (HSM) model is introduced as a new conceptual framework for modelling long-term persistence in hydrological time series. Unlike the stochastic models currently used, the conceptual basis of the HSM model can be related to the physical processes that influence long-term hydrological time series in the Australian climatic regime. A Bayesian approach was used for model calibration. This enabled rigorous evaluation of parameter uncertainty, which proved crucial for the interpretation of the results. Applying the single-site HSM model to rainfall data from selected Australian capital cities provided some revealing insights. In eastern Australia, where there is a significant influence from the tropical Pacific weather systems, the results showed that a weak wet and medium dry state persistence was likely to exist. In southern Australia the results were inconclusive. However, they suggested a weak wet and strong dry persistence structure may exist, possibly due to the infrequent incursion of tropical weather systems in southern Australia. This led to the postulate that the tropical weather systems are the primary cause of two-state long-term persistence. The single- and multi-site HSM model results for the Warragamba catchment rainfall data supported this hypothesis: a strong two-state persistence structure was likely to exist in the rainfall regime of this important water supply catchment. In contrast, the single- and multi-site results for the Williams River catchment rainfall data were inconsistent, which illustrates that further work is required to understand the application of the HSM model. Comparisons with the lag-one autoregressive [AR(1)] model showed that it was not able to reproduce the same long-term persistence as the HSM model. However, with record lengths typical of real data the difference between the two approaches was not statistically significant. Nevertheless, it was concluded that the HSM model provides a conceptually richer framework than the AR(1) model.
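A simulation sketch of the contrast drawn above, with purely illustrative transition probabilities and rainfall statistics (not the thesis's calibrated values):

```python
# Two-state hidden state Markov (HSM) rainfall versus AR(1): the HSM draws
# annual rainfall from a persistent wet or dry climate state governed by a 2x2
# transition matrix, whereas AR(1) carries persistence only through the
# lag-one correlation.
import numpy as np
rng = np.random.default_rng(7)

def simulate_hsm(n, P, mu, sigma):
    """P: 2x2 state transition matrix; mu, sigma: per-state rainfall mean/sd."""
    states = np.empty(n, dtype=int)
    states[0] = 0
    for t in range(1, n):
        states[t] = rng.choice(2, p=P[states[t - 1]])
    return rng.normal(mu[states], sigma[states]), states

def simulate_ar1(n, phi, mu, sigma):
    y = np.empty(n)
    y[0] = mu
    for t in range(1, n):
        y[t] = mu + phi * (y[t - 1] - mu) + rng.normal(0, sigma)
    return y

P = np.array([[0.7, 0.3],      # weak wet-state persistence (illustrative)
              [0.05, 0.95]])   # strong dry-state persistence (illustrative)
hsm, _ = simulate_hsm(6000, P, mu=np.array([900.0, 650.0]),
                      sigma=np.array([180.0, 150.0]))
ar1 = simulate_ar1(6000, phi=0.2, mu=780.0, sigma=170.0)
# 30-year mean rainfall varies far more under the HSM's long-term persistence
print(np.std(hsm.reshape(-1, 30).mean(axis=1)),
      np.std(ar1.reshape(-1, 30).mean(axis=1)))
```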
|
28 |
分析失去部分訊息的貝氏更新計算方法 / Bayesian updating methods for the analysis of censored data. 范靜宜 (Fan, Gin-Yi), Unknown Date (has links)
Bayesian treatments of categorical sampling with partially classified (censored) data are mostly built on the assumptions of truthful reporting and non-informative censoring. Jiang (1995) and Jiang and Dickey (2006) removed these two restrictions, derived the Bayesian solution, and used the quasi-Bayes method to approximate it, while Jiang and Ko (2004) used the Gibbs sampler to approximate the Bayesian solution for this class of problems. In this thesis we first apply the "average variance sum" estimation method proposed by Kuroda, Geng and Niki (2001) to the Bayesian solution of our problem. For small samples the Bayesian solution can be computed numerically, so another focus of the thesis is to compare, in small samples, the accuracy of the estimates from the three methods above and to examine how the choice of prior parameters affects the estimates. We further prove that, for a particular choice of prior, the results computed with the average variance sum method coincide with the quasi-Bayes estimates, and both equal the exact Bayesian results.
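For orientation, here is a sketch of the Gibbs/data-augmentation idea in the textbook setting of non-informatively censored categorical counts (the thesis relaxes exactly these assumptions; the counts and prior below are invented for the example):

```python
# Gibbs sampler for partially classified multinomial data: fully classified
# counts n_full over three categories, plus m_01 units known only to fall in
# categories {0, 1}. Alternate between allocating the censored units and a
# conjugate Dirichlet update of the cell probabilities.
import numpy as np
rng = np.random.default_rng(3)

def gibbs_censored_multinomial(n_full, m_01, alpha, iters=2000):
    draws = []
    p = rng.dirichlet(alpha)
    for _ in range(iters):
        # allocate the partially classified units between categories 0 and 1
        z0 = rng.binomial(m_01, p[0] / (p[0] + p[1]))
        counts = n_full + np.array([z0, m_01 - z0, 0])
        # conjugate Dirichlet update given the completed counts
        p = rng.dirichlet(alpha + counts)
        draws.append(p)
    return np.array(draws)

draws = gibbs_censored_multinomial(n_full=np.array([30, 20, 50]), m_01=25,
                                   alpha=np.ones(3))
print(draws[500:].mean(axis=0))   # posterior means of the cell probabilities
```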
|
29 |
Analysis Of Stochastic And Non-stochastic Volatility Models. Ozkan, Pelin, 01 September 2004 (links) (PDF)
Changes in variance, or volatility, over time can be modeled deterministically by using autoregressive conditional heteroscedastic (ARCH) type models, or stochastically by using stochastic volatility (SV) models. This study compares these two kinds of models, estimated on Turkish/USA exchange rate data. First, a GARCH(1,1) model is fitted to the data using the EViews package, and then a Bayesian estimation procedure is used to estimate an appropriate SV model with the help of Ox code. To compare the models, the likelihood ratio (LR) test statistic for non-nested hypotheses is computed.
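For readers unfamiliar with the first step, the sketch below fits a GARCH(1,1) model by (quasi-)maximum likelihood with scipy, standing in for the EViews estimation described above; the return series and starting values are placeholders, not the thesis data.

```python
# GARCH(1,1): h_t = omega + alpha*r_{t-1}^2 + beta*h_{t-1}, Gaussian
# quasi-likelihood, minimised with Nelder-Mead (infeasible parameter values
# are rejected by returning +inf).
import numpy as np
from scipy.optimize import minimize

def garch11_neg_loglik(params, r):
    omega, alpha, beta = params
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        return np.inf
    h = np.empty_like(r)
    h[0] = r.var()
    for t in range(1, len(r)):
        h[t] = omega + alpha * r[t - 1] ** 2 + beta * h[t - 1]
    return 0.5 * np.sum(np.log(2 * np.pi) + np.log(h) + r ** 2 / h)

rng = np.random.default_rng(5)
r = rng.standard_t(df=6, size=1500) * 0.01          # placeholder "returns"
res = minimize(garch11_neg_loglik, x0=[1e-5, 0.05, 0.90], args=(r,),
               method="Nelder-Mead")
print(res.x)   # omega, alpha, beta estimates
```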
|
30 |
Qualidade das informações de parentesco na avaliação genética de bovinos de corte / Quality of relationship information on genetic evaluation of beef cattle. Junqueira, Vinícius Silva, 25 July 2014 (has links)
The aim of this study was to evaluate the impact of the quality of relationship information on estimates of genetic parameters and on the accuracy of predicted breeding values. Pedigree quality was assessed by making corrections based on SNP markers. Variance components were estimated under a Bayesian approach via Gibbs sampling. Relationship verification was performed using markers from 3,591 individuals in a population of 12,668 animals, with Mendelian conflicts between progeny and parent genotypes used as the criterion. In total, 460 pedigree changes were made, 54% of which involved an identified bull or cow. On average, accepted parent-progeny links had 75 conflicting markers, while rejected links had about 2,700. The annealing algorithm of the Molecular Coancestry program identified 2,174 new half-sib relationships. Higher-quality pedigree information increased the accuracy of predicted breeding values and yielded a higher heritability estimate (0.22 ± 0.0286), suggesting the possibility of direct selection for tick resistance. A 5-fold cross-validation strategy was used, with animals grouped both by K-means and at random; the results indicate that higher-quality pedigree relationships provide higher accuracy. Weaning weight was used to evaluate the inclusion of progeny of multiple sires (MS) and of animals produced by embryo transfer and in vitro fertilization (TEF). Three strategies for including MS information were compared: genetic groups (GG), a hierarchical Bayesian method (HIER), and the average numerator relationship matrix (ANRM). The deviance information criterion (DIC) was used as a Bayesian measure of fit; it favoured the HIER strategy when MS information was considered, whereas ANRM provided the best fit when including TEF animals. Recipient (foster) dam information was used to estimate maternal genetic and maternal permanent environmental effects for TEF animals. Including TEF information did not change variance components or genetic parameters significantly, but the predicted breeding values were of greater magnitude. Spearman correlations between breeding values were high for base animals, animals with known paternity, and MS progeny; however, using TEF information changed the predicted breeding values substantially. Appropriate methods for including information from MS progeny and TEF animals provide more accurate breeding value predictions and support higher rates of genetic gain.
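A sketch of the Mendelian-conflict check used above to verify parent-progeny links (hypothetical genotype coding, not the thesis pipeline; the 75 and 2,700 figures quoted in the abstract are averages and are reused here only as illustrative accept/reject thresholds):

```python
# With SNP genotypes coded as 0/1/2 copies of the reference allele, an
# opposing-homozygote pair (parent 0 vs progeny 2, or vice versa) is a
# Mendelian conflict; a candidate link with few conflicts is accepted and one
# with many is rejected.
import numpy as np

def mendelian_conflicts(parent_geno, progeny_geno):
    """Count opposing-homozygote SNPs, ignoring missing calls coded as -1."""
    ok = (parent_geno >= 0) & (progeny_geno >= 0)
    return int(np.sum(np.abs(parent_geno[ok] - progeny_geno[ok]) == 2))

def verify_parentage(parent_geno, progeny_geno, accept=75, reject=2700):
    c = mendelian_conflicts(parent_geno, progeny_geno)
    if c <= accept:
        return "accept", c
    if c >= reject:
        return "reject", c
    return "uncertain", c

# toy example: a calf genotype derived from the sire with a little genotyping error
rng = np.random.default_rng(11)
sire = rng.integers(0, 3, size=50_000)
calf = sire.copy()
err = rng.random(sire.size) < 0.001
calf[err] = rng.integers(0, 3, size=err.sum())
print(verify_parentage(sire, calf))
```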
|