201

Bayesian approaches of Markov models embedded in unbalanced panel data

Muller, Christoffel Joseph Brand 12 1900
Thesis (PhD)--Stellenbosch University, 2012. / Multi-state models are used in this dissertation to model panel data, also known as longitudinal or cross-sectional time-series data. These are data sets in which units are observed at two or more points in time. Such models have been used extensively in medical studies where the disease states of patients are recorded over time. A theoretical overview of current multi-state Markov models as applied to panel data is presented and, based on this theory, a simulation procedure is developed to generate panel data sets for given Markov models. This procedure is used in a simulation study to investigate the properties of the standard likelihood approach to fitting Markov models and to assess its shortcomings. One of the main shortcomings highlighted by the simulation study is the instability of the estimates obtained from the standard likelihood models, especially when they are fitted to small data sets. A Bayesian approach is introduced to develop multi-state models that overcome these unstable estimates by incorporating prior knowledge into the modelling process. Two Bayesian techniques are developed and presented, and their properties are assessed through extensive simulation studies. Firstly, Bayesian multi-state models are developed by specifying prior distributions for the transition rates, constructing a likelihood using standard Markov theory, and then obtaining the posterior distributions of the transition rates. A select few priors are used in these models. Secondly, Bayesian multi-state imputation techniques are presented that use suitable prior information to impute missing observations in the panel data sets. Once the data are imputed, standard likelihood-based Markov models are fitted to the imputed data sets to estimate the transition rates. Two different Bayesian imputation techniques are presented. The first uses the Dirichlet distribution and imputes the unknown states at all time points with missing observations. The second uses a Dirichlet process to estimate the time at which a transition occurred between two known observations, and a state is then imputed at that estimated transition time. The simulation studies show that these Bayesian methods yield more stable estimates, even when only small samples are available.
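
The Dirichlet-based imputation described in this abstract can be illustrated with a rough sketch. It is not the dissertation's procedure: it assumes a discrete-time simplification with equally spaced observation times, a symmetric Dirichlet prior on each row of the transition matrix, and a hypothetical function name `impute_missing_states`. Missing states are marked with `None`.

```python
import numpy as np

def impute_missing_states(seq, n_states, prior=1.0, rng=None):
    """Fill None entries in one unit's state sequence.

    Assumption: discrete-time chain; each transition row has a symmetric
    Dirichlet(prior) prior updated with the fully observed transitions.
    """
    rng = np.random.default_rng() if rng is None else rng
    counts = np.zeros((n_states, n_states))
    for a, b in zip(seq[:-1], seq[1:]):
        if a is not None and b is not None:
            counts[a, b] += 1
    # One posterior draw of the transition matrix, row by row.
    P = np.vstack([rng.dirichlet(prior + counts[i]) for i in range(n_states)])
    filled = list(seq)
    for t, s in enumerate(filled):
        if s is not None:
            continue
        w = np.ones(n_states)
        if t > 0:
            w = w * P[filled[t - 1]]      # previous state is known or already imputed
        if t + 1 < len(filled) and filled[t + 1] is not None:
            w = w * P[:, filled[t + 1]]   # condition on the next observed state
        filled[t] = int(rng.choice(n_states, p=w / w.sum()))
    return filled
```

With few observed transitions the prior dominates each row, which is one way prior knowledge can steady the estimates on small samples; a standard Markov model would then be fitted to the filled sequence.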
202

Analysis of Binary Data via Spatial-Temporal Autologistic Regression Models

Wang, Zilong 01 January 2012
Spatial-temporal autologistic models are useful for binary data measured repeatedly over time on a spatial lattice. They can account for the effects of potential covariates and for spatial-temporal statistical dependence among the data. However, the traditional parametrization of the spatial-temporal autologistic model makes the model parameters difficult to interpret across varying levels of statistical dependence, because its non-negative autocovariates can bias the realizations toward 1. To obtain interpretable parameters, a centered spatial-temporal autologistic regression model is developed. Two efficient statistical inference approaches are proposed: an expectation-maximization pseudo-likelihood approach (EMPL) and a Monte Carlo expectation-maximization likelihood approach (MCEML). Bayesian inference is also considered. The performance and efficiency of these three inference approaches are studied across various sizes of sampling lattice and numbers of sampling time points, through both a simulation study and a real data example. In addition, we consider the imputation of missing values for spatial-temporal autologistic regression models. Most existing imputation methods are not suitable for spatial-temporal missing values, because they can disrupt the inherent structure of the data and lead to serious bias during inference or to problems of computational efficiency. Two imputation methods, iteration-KNN imputation and maximum entropy imputation, are proposed; both are relatively simple and yield reasonable results. In summary, the main contributions of this dissertation are the development of a spatial-temporal autologistic regression model with centered parameterization and the proposal of EMPL, MCEML, and Bayesian inference to estimate the model parameters. Iteration-KNN and maximum entropy imputation methods are also presented for spatial-temporal missing data; they generate reliable imputed values in reasonable imputation time.
203

Imputação AMMI Bootstrap Não-paramétrico em dados multiambientais / AMMI imputation Non-parametric bootstrap in multenvironmental data

Silva, Maria Joseane Cruz da 20 January 2017
In multi-environment studies, recommending higher-yielding genotypes and identifying stable genotypes are of utmost importance to plant breeders. When a genotype is missing in one or more environments, however, this process becomes difficult, because it depends on statistical methods that require a complete data matrix. Since 1976, mathematicians and statisticians have continually studied ways of handling missing values in multi-environment data, seeking a method that estimates the missing units precisely without loss of information. This research therefore proposes a new imputation method based on the AMMI methodology, applying non-parametric bootstrap resampling to the matrix of genotype-by-environment (G × E) interaction means: the non-parametric bootstrap AMMI imputation model (IAMMI-BNP). The simulation study used the data set for the provenance S. of Ravenshoe - Mt Pandanus - QLD (14.420) of Eucalyptus grandis, collected in Australia in 1983. To obtain precise estimates of the missing values, two simulation studies were carried out. The first used 2000 resamples along the rows of the G × E interaction matrix, with two percentages of missing data (10% and 20%). The second used 200 resamples of the matrix with 10% missing data and three IAMMI-BNP models: IAMMI0-BNP, which considers only the main effects of the AMMI model, and IAMMI1-BNP and IAMMI2-BNP, which consider one and two multiplicative axes of the AMMI model, respectively. Overall, according to the comparison methods, the proposed imputation method gave imputed values close to the originals in both simulation studies. In the simulation studies with 10% missing data, the proposed method was most efficient with the IAMMI2-BNP model (two multiplicative axes). The Wilcoxon signed-rank test showed that the imputed values did not affect the estimate of the mean, indicating that the mean values of the imputed data in each environment were statistically similar to the original mean values.
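
The idea of filling a G × E matrix from an AMMI fit can be sketched roughly as follows. This is only an illustration under stated assumptions: it iterates a plain AMMI fit (additive main effects plus `n_axes` multiplicative axes from an SVD of the residuals) and leaves out the non-parametric bootstrap resampling that defines IAMMI-BNP. The function name `ammi_impute` and its parameters are hypothetical.

```python
import numpy as np

def ammi_impute(Y, n_axes=2, n_iter=100, tol=1e-8):
    """Iteratively impute NaN cells of a genotype x environment mean matrix."""
    Y = np.asarray(Y, dtype=float)
    miss = np.isnan(Y)
    X = np.where(miss, np.nanmean(Y), Y)            # start from the observed grand mean
    for _ in range(n_iter):
        grand = X.mean()
        g = X.mean(axis=1, keepdims=True) - grand   # genotype main effects
        e = X.mean(axis=0, keepdims=True) - grand   # environment main effects
        additive = grand + g + e
        fit = additive
        if n_axes > 0:
            # Multiplicative axes: leading singular components of the residuals.
            U, s, Vt = np.linalg.svd(X - additive, full_matrices=False)
            k = min(n_axes, len(s))
            fit = additive + (U[:, :k] * s[:k]) @ Vt[:k]
        new = np.where(miss, fit, Y)                # observed cells never change
        converged = np.max(np.abs(new - X)) < tol
        X = new
        if converged:
            break
    return X
```

Here `n_axes=0`, `1` and `2` loosely mirror the IAMMI0, IAMMI1 and IAMMI2 variants named in the abstract. On a small matrix, asking for as many axes as the residuals have rank reproduces the starting values, so the number of axes should stay well below the matrix dimensions.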
204

Ajuste de modelos e comparação de séries temporais para dados de vazão específica em microbacias pareadas / Fitting of models and comparison of time series for specific flow data in paired catchments

Amaral, Marcus Vinicius Silva Gurgel do 15 July 2014
Growing concern for the environment is pressing society as a whole to change toward more sustainable habits. In the productive sector, the impetus comes from developing more efficient production techniques grounded in research and field experiments. In forestry, besides the concern with management techniques and with the soil, the main resource to be preserved is water. Monitoring rivers in drainage basins yields historical series, which makes it possible to use time series theory to fit models by the Box and Jenkins methodology. When paired catchments are monitored, the time series can also be compared, as described in this work. Two distinct time series of specific flow were collected in two paired catchments on a farm in the municipality of Telêmaco Borba, in the east-central region of Paraná state. Because the data sets contained gaps, an imputation methodology was applied in two different ways, which made it possible to compare the two series afterward. According to the results, the two series differ both under the test comparing autocorrelation functions and under the time series comparison test proposed by Silva, Ferreira and Sáfadi (2000). Following the characterization of paired-catchment studies, it can therefore be concluded that the forest management applied at the two sites affects the behavior of the evaluated variable differently.
206

Methods and software to enhance statistical analysis in large scale problems in breeding and quantitative genetics

Pook, Torsten 27 June 2019
No description available.
207

Systèmes experts à base de connaissances profondes : application à un poste de travail intelligent pour le comptable

Page, Michel 02 February 1990
Most current expert systems rely on the surface knowledge (the know-how) of an expert in the application domain. More recently, another approach has emerged, which aims to exploit the deep (theoretical) knowledge acquired in the application domain. This thesis studies the latter approach within the PIC project (poste de travail intelligent pour le comptable, an intelligent workstation for the accountant). Methodological aspects are developed in the first part. A new class of expert system applications is proposed: comparative interpretation, whose purpose is to identify and explain the causes of the differences between two states of a system. A general method for approaching this problem is presented, along with techniques that implement it on qualitative and numerical models. Contributions to the PIC project are developed in the second part. A generator of comparative-interpretation expert systems is presented first. It was used to build two systems: the first for analyzing a company's performance by the surplus method, the second for corporate financial diagnosis. An expert system for deducing accounting entries, also using the deep approach, is then presented. In light of these last two applications, which had already been addressed by expert systems using surface knowledge, the two approaches to expert system design are compared.
208

A Study of Missing Data Imputation and Predictive Modeling of Strength Properties of Wood Composites

Zeng, Yan 01 August 2011
Problem: Real-time process and destructive test data were collected from a wood composite manufacturer in the U.S. to develop real-time predictive models of two key strength properties, Modulus of Rupture (MOR) and Internal Bond (IB), of a wood composite manufacturing process. Sensor malfunction and data “send/retrieval” problems led to null fields in the company’s data warehouse, which resulted in information loss. Many manufacturers attempt to build accurate predictive models either by excluding entire records with null fields or by substituting summary statistics such as the mean or median for the null field. However, predictive model errors in validation may be higher in the presence of information loss. In addition, the selection of predictive modeling methods poses another challenge to many wood composite manufacturers. Approach: This thesis consists of two parts addressing the above issues: 1) how to improve data quality using missing data imputation; 2) which predictive modeling method is better in terms of prediction precision, measured by root mean square error (RMSE). The first part summarizes an application of missing data imputation methods in predictive modeling. After variable selection, two missing data imputation methods were selected from a comparison of six candidate methods. Predictive models of imputed data were developed using partial least squares regression (PLSR) and compared with models of non-imputed data using ten-fold cross-validation. Root mean square error of prediction (RMSEP) and normalized RMSEP (NRMSEP) were calculated. The second part presents a series of comparisons among four predictive modeling methods using imputed data without variable selection. Results: The first part concludes that the expectation-maximization (EM) algorithm and multiple imputation (MI) using Markov Chain Monte Carlo (MCMC) simulation achieved more precise results. Predictive models based on imputed datasets generated more precise predictions (average NRMSEP of 5.8% for the MOR model and 7.2% for the IB model) than models of non-imputed datasets (average NRMSEP of 6.3% for the MOR model and 8.1% for the IB model). The second part finds that Bayesian Additive Regression Tree (BART) produced more precise predictions (average NRMSEP of 7.7% for the MOR model and 8.6% for the IB model) than the other three models: PLSR, LASSO, and Adaptive LASSO.
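
The RMSEP and NRMSEP comparison under k-fold cross-validation can be sketched as below. Two things here are assumptions rather than the thesis's choices: NRMSEP is normalised by the mean observed value (the abstract does not give the normalisation), and ordinary least squares stands in for PLSR to keep the example dependency-free. The function names are hypothetical.

```python
import numpy as np

def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def nrmsep(y_true, y_pred):
    # Assumption: normalise by the mean observed value, reported as a percentage.
    return 100.0 * rmsep(y_true, y_pred) / float(np.mean(y_true))

def kfold_nrmsep(X, y, k=10, seed=0):
    """Average NRMSEP of an ordinary least-squares fit over k folds."""
    X, y = np.asarray(X, float), np.asarray(y, float)
    idx = np.random.default_rng(seed).permutation(len(y))
    scores = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)              # everything outside the fold
        A = np.column_stack([np.ones(len(train)), X[train]])
        beta, *_ = np.linalg.lstsq(A, y[train], rcond=None)
        pred = np.column_stack([np.ones(len(fold)), X[fold]]) @ beta
        scores.append(nrmsep(y[fold], pred))
    return float(np.mean(scores))
```

Running `kfold_nrmsep` once on an imputed data set and once on the records left after dropping null fields gives the kind of side-by-side percentages quoted in the results.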
209

Sensitivity Analyses in Empirical Studies Plagued with Missing Data

Liublinska, Viktoriia 07 June 2014
Analyses of data with missing values often require assumptions about missingness mechanisms that cannot be assessed empirically, highlighting the need for sensitivity analyses. However, universal recommendations for reporting missing data and conducting sensitivity analyses in empirical studies are scarce. Both steps are often neglected by practitioners due to the lack of clear guidelines for summarizing missing data and systematic explorations of alternative assumptions, as well as the typical attendant complexity of missing not at random (MNAR) models. We propose graphical displays that help visualize and systematize the results of sensitivity analyses, building upon the idea of "tipping-point" analysis for experiments with dichotomous treatment. The resulting "enhanced tipping-point displays" (ETP) are convenient summaries of conclusions drawn from using different modeling assumptions about the missingness mechanisms, applicable to a broad range of outcome distributions. We also describe a systematic way of exploring MNAR models using ETP displays, based on a pattern-mixture factorization of the outcome distribution, and present a set of sensitivity parameters that arises naturally from such a factorization. The primary goal of the displays is to make formal sensitivity analyses more comprehensible to practitioners, thereby helping them assess the robustness of experiments' conclusions. We also present an example of a recent use of ETP displays in a medical device clinical trial, which helped lead to FDA approval. The last part of the dissertation demonstrates another method of sensitivity analysis in the same clinical trial. The trial is complicated by missingness in outcomes "due to death", and we address this issue by employing the Rubin Causal Model and principal stratification. 
We propose an improved method to estimate the joint posterior distribution of estimands of interest using a Hamiltonian Monte Carlo algorithm and demonstrate its superiority for this problem to the standard Metropolis-Hastings algorithm. The proposed methods of sensitivity analyses provide new collections of useful tools for the analysis of data sets plagued with missing values. / Statistics
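
The basic tipping-point idea that the ETP displays build on can be sketched for a binary outcome in two arms. This is a minimal illustration, not the dissertation's displays: it assumes a pooled two-proportion z-test as the analysis, and the function names are hypothetical. Every possible number of successes among the missing outcomes in each arm is tried, and the resulting p-value is recorded.

```python
import math

def two_prop_pvalue(s1, n1, s2, n2):
    """Two-sided p-value of a pooled two-proportion z-test."""
    p = (s1 + s2) / (n1 + n2)
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0
    z = (s1 / n1 - s2 / n2) / se
    return math.erfc(abs(z) / math.sqrt(2))

def tipping_point_grid(succ_t, obs_t, miss_t, succ_c, obs_c, miss_c):
    """p-value for every way of filling in the missing binary outcomes."""
    grid = {}
    for a in range(miss_t + 1):        # imputed successes, treatment arm
        for b in range(miss_c + 1):    # imputed successes, control arm
            grid[(a, b)] = two_prop_pvalue(
                succ_t + a, obs_t + miss_t, succ_c + b, obs_c + miss_c)
    return grid
```

Plotting the grid as a heat map, with the cells where the p-value crosses the chosen significance level marked, shows how extreme the missing outcomes would have to be to overturn the study's conclusion.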
210

Modélisation des données d'enquêtes cas-cohorte par imputation multiple : Application en épidémiologie cardio-vasculaire.

Marti soler, Helena 04 May 2012
The weighted estimators generally used to analyze case-cohort studies are not fully efficient. Case-cohort studies, however, are a special case of incomplete data in which the observation process is controlled by the study organizers. Methods for data missing at random (MAR) can therefore be relevant, in particular multiple imputation, which uses all the available information and makes it possible to approximate the maximum partial likelihood estimator. This method is based on generating several plausible completed data sets that account for the different levels of uncertainty about the missing data. It makes it easy to adapt any statistical tool available for cohort data, for example estimating the predictive ability of a model or of an additional variable, which raises specific problems in case-cohort studies. We showed that the imputation model must be estimated from all completely observed subjects (cases and non-cases), including the case-status indicator among the explanatory variables. We validated this approach with several series of simulations: 1) fully simulated data, for which the true parameter values were known; 2) case-cohort studies simulated from the PRIME cohort, in which no phase-1 variable (observed on all subjects) strongly predictive of the phase-2 variable (incompletely observed) was available; 3) case-cohort studies simulated from the NWTS cohort, in which a phase-1 variable strongly predictive of the phase-2 variable was available. These simulations showed that multiple imputation generally gave unbiased estimators of the relative risks. For phase-1 variables, these estimators approached the precision obtained from analysis of the full cohort; they were slightly more precise than the calibrated estimator of Breslow et al. and, above all, more precise than the classical weighted estimators. For phase-2 variables, the multiple imputation estimator was generally unbiased, with precision higher than that of the classical weighted estimators and similar to that of the calibrated estimator. The results of the simulations based on the NWTS cohort data were, however, less good for effects involving the phase-2 variable: the multiple imputation estimators were slightly biased and less precise than the weighted estimators. This is explained by the presence of interaction terms involving the phase-2 variable in the analysis model, which made it necessary to estimate imputation models specific to different strata of the cohort, some of which included too few cases for the asymptotic conditions to hold. We recommend using multiple imputation to obtain more precise estimates of the relative risks, while checking that they are similar to those given by the weighted analyses. Our simulations also showed that multiple imputation gave estimates of the predictive value of a model (Harrell's C) or of an additional variable (difference in C indices, NRI or IDI) similar to those given by the full cohort.
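
Multiple imputation produces one estimate per completed data set, and these are usually combined with Rubin's rules. The sketch below assumes that standard pooling step; the abstract does not spell out how the thesis combines its estimates, and the function name `pool_rubin` is hypothetical. In this setting the inputs would be, for example, log relative risks and their variances from a Cox model fitted to each completed cohort.

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Pool m point estimates and their variances with Rubin's rules."""
    q = np.asarray(estimates, float)
    u = np.asarray(variances, float)
    m = len(q)
    q_bar = q.mean()                 # pooled point estimate
    within = u.mean()                # average within-imputation variance
    between = q.var(ddof=1)          # between-imputation variance
    total = within + (1 + 1 / m) * between
    return float(q_bar), float(total)
```

For instance, `pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])` gives a pooled estimate of 2.0 with a total variance of about 1.83; the between-imputation term is what carries the uncertainty about the missing data.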
