71 |
Bayesian estimation of factor analysis models with incomplete data
Merkle, Edgar C. 10 October 2005
No description available.
|
72 |
Multiple comparisons using multiple imputation under a two-way mixed effects interaction model
Kosler, Joseph Stephen 22 September 2006
No description available.
|
73 |
EFFECT OF SMOKING AND CESSATION IN HIV-INFECTED PEOPLE
Cui, Qu 10 1900
Cigarette smoking is prevalent in HIV-infected people, resulting in a higher mortality rate and more premature heart and lung diseases in the highly active antiretroviral therapy era. Smoking is a modifiable risk factor for these adverse outcomes and smoking cessation in HIV-positive smokers is feasible, although further efforts are needed to improve smoking cessation programs in HIV-positive persons.
In this thesis, I examined the role of smoking in mortality and morbidity in HIV-positive Ontarians, and piloted a smoking cessation program featuring a novel smoking cessation aid, varenicline, in HIV-infected smokers. In addition, I explored three different methods to resolve missing data, by excluding, grouping and multiply imputing missing data. I adopted three different study designs in my thesis studies: retrospective cohort, cross-sectional and open label study.
We found smoking prevalence in HIV-infected people was consistently higher than in the general population. Smoking was associated with a higher risk of death, of respiratory symptoms, hospitalization and chronic obstructive pulmonary disease, and with reduced lung function and less CD4-T-lymphocyte improvement over time. We found varenicline was as effective in HIV-positive smokers as in non-HIV smokers reported by previous studies. / Doctor of Philosophy (PhD)
|
74 |
Lean Implementation and the Role of Lean Accounting in the Transportation Equipment Manufacturing Industry
Andersch, Adrienn 13 November 2014
Implementing Lean in the United States transportation equipment manufacturing industry holds the promise for improvements in, among other things, productivity, quality, and innovation, resulting in more competitive success and profits. Although Lean has been applied throughout the industry with noted success, there have been some difficulties in demonstrating the financial benefits derived from Lean initiatives. Most of the evidence supporting a positive relationship between Lean implementation and improved financial performance is anecdotal. As companies have become more proficient in carrying out Lean initiatives in manufacturing, they have extended Lean ideas to other parts of their organization and throughout the entire supply chain. Nowadays, it is widely recognized that a holistic, enterprise-wide view is critical to obtain the potential benefits of a Lean transformation.
However, Lean transformations are often undertaken without consideration of supporting functions such as accounting and finance. Lean transformation in accounting and finance should be run in the same way as it is in the manufacturing environment by decreasing reporting cycle time, improving transaction processing accuracy, eliminating unnecessary transaction processing, changing product costing procedures, and financial reporting among many other things, but there is limited empirical evidence of that happening.
To address these shortcomings, this research focuses on three areas. First, this study aims to evaluate transportation equipment manufacturing facilities in respect to their operational and financial performance. Second, this study aims to investigate the extent of Lean implementation of a given operation in respect to leadership, manufacturing, accounting and finance, and supplier and customer relationship and correlate these results to their performance. Finally, this study aims to further examine the contextual characteristics of companies that successfully aligned their systems with Lean.
A mixed-mode survey, addressed to a subset of the United States transportation equipment manufacturing industry, asked questions pertinent to companies' Lean transformation efforts, performance, and general characteristics. During the four-month survey period, a total of 69 valid responses were received, for a response rate of 3.78 percent. Of the 69 valid responses, 8 were eliminated because they contained more than 20 percent missing values. A multiple imputation procedure was applied to handle the remaining missing values in the dataset. Before testing the study hypotheses, scale reliability and construct validity tests were run to decide whether a particular survey item should be retained in further analysis. Study hypotheses were then tested using profile deviation analysis, multiple regression analysis, and hierarchical regression analysis.
When the level of Lean implementation and performance relationship was investigated using a multiple regression analysis, results did not show any evidence that the higher level of Lean implementation along four business dimensions (leadership, manufacturing, accounting and finance, and supplier and customer relationship) of transportation equipment manufacturing facilities positively influences their operational and financial performance. However, it was revealed that the higher level of Lean implementation in transportation equipment manufacturing facilities' manufacturing dimension resulted in better quality performance as measured by first-time through, inbound quality, and outbound quality. When the same relationship was investigated using a profile deviation analysis, results were identical.
When the level of Lean implementation in accounting and finance and its relationship with performance was investigated using a single regression analysis, results showed that a higher level of Lean implementation in transportation equipment manufacturing facilities' accounting and finance dimension has a positive effect on accounting performance and on operational performance (e.g., on time-based performance and delivery-based performance), but no effect on financial performance. When the same relationship was investigated using a profile deviation analysis, results differed, showing no relationship between the level of Lean implementation in the facilities' accounting and finance dimension and accounting, operational, or financial performance.
Lastly, the effect of contextual variables (e.g., industry segment, location, annual sales volume, and unionization) on performance, the level of Lean implementation, and the performance -- Lean implementation relationship was investigated using hierarchical regression. Results showed that transportation equipment manufacturing facilities' performance is influenced by annual sales volume. Their level of Lean implementation in the accounting and finance dimension is influenced by location, while their performance -- Lean implementation in the accounting and finance dimension relationship is influenced by industry segment. / Ph. D.
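The screening step mentioned in this abstract (dropping responses with more than 20 percent missing values before imputing the rest) can be illustrated with a minimal sketch. The function names, the use of `None` as the missing-value marker, and the toy data are assumptions made for illustration; they are not taken from the dissertation.

```python
# Hypothetical sketch: screen survey responses by their share of missing items.
# None marks a missing answer; only the 20 percent threshold comes from the abstract.

def missing_fraction(response):
    """Return the fraction of items in one response that are missing (None)."""
    if not response:
        return 1.0
    return sum(value is None for value in response) / len(response)


def screen_responses(responses, max_missing=0.20):
    """Keep responses whose missing fraction does not exceed max_missing."""
    return [r for r in responses if missing_fraction(r) <= max_missing]


# Made-up toy data: three respondents, five items each.
toy = [
    [4, 5, None, 3, 4],     # 20% missing: kept (not more than the threshold)
    [None, None, 2, 3, 1],  # 40% missing: dropped
    [1, 2, 3, 4, 5],        # complete: kept
]
kept = screen_responses(toy)
```

The remaining missing values in `kept` would then be handled by an imputation procedure, which this sketch does not attempt to reproduce.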
|
75 |
Estimating Individual Treatment Effects Using Emerging Methods from Machine Learning and Multiple Imputation
Park, Sangbaek January 2024
This dissertation used synthetic datasets, semi-synthetic datasets, and a real-world dataset from an educational intervention to compare the performance of 15 machine learning and multiple imputation methods to estimate the individual treatment effect (ITE). In addition, it examined the performance of five evaluation metrics that can be used to identify the best ITE estimation method when conducting research with real-world data.
Among the ITE estimation methods that were analyzed, the S-learner, the Bayesian Causal Forest (BCF), the Causal Forest, and the X-learner exhibited the best performance. In general, the meta-learners with BART and tree-based direct estimation methods performed better than the representation learning methods and the multiple imputation methods. As for the evaluation metrics, τ-risk_R and the Switch Doubly Robust MSE (SDR-MSE) performed the best in identifying the best ITE estimation method when the true treatment effect was unknown.
This dissertation contributes to a small but growing body of research on ITE estimation which is gaining popularity in various fields due to its potential for tailoring interventions to meet the needs of individuals and targeting programs at those who would benefit the most from those interventions.
|
76 |
Kontexteffekte in Large-Scale Assessments
Weirich, Sebastian 13 August 2015
This cumulative doctoral thesis evaluates, within the framework of item response theory (IRT), various methods and models for identifying context effects in large-scale assessments. Such effects may occur in quantitative educational assessments and may bias item and person parameter estimates. To judge in individual cases whether context effects occur and, if so, how they bias the parameters, IRT models are needed that parametrize context effects in addition to item and person effects. Such a parametrization is possible within generalized linear mixed models. The thesis consists of three contributions. The first introduces a model for estimating context effects in an IRT framework. Item position effects are examined as an example of context effects, and the statistical properties of the measurement model are investigated in a simulation study. The results underline the importance of the test design: a balanced incomplete test design is necessary not only to obtain valid item parameters in the Rasch model but also to guarantee unbiased estimation of position effects in more complex IRT models. The third contribution deals with the problem of missing background data in large-scale assessments. Here, the effect that systematically influences the probability of a missing value on a given variable is treated as a context effect. The principle of multiple imputation was applied to the problem of missing background data. In contrast to the approaches used in practice so far (dummy coding of missing values), unbiased population and subpopulation estimates were obtained in a simulation study for almost all conditions.
|
77 |
Imputação AMMI Bootstrap Não-paramétrico em dados multiambientais / AMMI imputation Non-parametric bootstrap in multenvironmental data
Silva, Maria Joseane Cruz da 20 January 2017
In multi-environment studies, recommending the highest-yielding genotypes and identifying stable genotypes are of utmost importance to plant breeders. This process becomes difficult, however, when a genotype is missing in one or more environments, because it relies on statistical methods that require a complete data matrix. Since 1976, mathematicians and statisticians have continually studied ways of handling missing values in multi-environment data, seeking a method that estimates the missing units precisely without loss of information. This research proposes a new imputation method based on the AMMI methodology that applies non-parametric bootstrap resampling to the matrix of genotype × environment (G × E) interaction means: the non-parametric bootstrap AMMI imputation model (IAMMI-BNP). The simulation study used the data set for the provenance S. of Ravenshoe - Mt Pandanus - QLD (14,420) of Eucalyptus grandis, collected in Australia in 1983. To obtain precise estimates of the missing values, two simulation studies were carried out. The first used 2000 resamples along the rows of the G × E interaction matrix for two percentages of missing data (10% and 20%). The second used 200 resamples of the matrix with 10% missing data and three IAMMI-BNP models: IAMMI0-BNP, which considers only the main effects of the AMMI model, and IAMMI1-BNP and IAMMI2-BNP, which consider one and two multiplicative axes of the AMMI model, respectively. In general, according to the comparison methods, the proposed imputation method provided imputed values close to the originals in both simulation studies. In the simulation studies with 10% missing data, the proposed method was most efficient with the IAMMI2-BNP model (two multiplicative axes). The Wilcoxon signed-rank test showed that the imputed values did not influence the estimate of the mean, indicating that the mean of the imputed data in each environment was statistically similar to the original mean.
|
79 |
Modélisation des données d'enquêtes cas-cohorte par imputation multiple : application en épidémiologie cardio-vasculaire / Modeling of case-cohort data by multiple imputation: application to cardio-vascular epidemiology
Marti soler, Helena 04 May 2012
The weighted estimators generally used to analyze case-cohort studies are not fully efficient. However, case-cohort studies are a special case of incomplete data in which the observation process is controlled by the study organizers. Methods for data missing at random (MAR) may therefore be appropriate, in particular multiple imputation, which uses all the available information and approximates the partial maximum likelihood estimator. This approach generates several plausible completed data sets that reflect the different levels of uncertainty about the missing values. It makes it easy to adapt any statistical tool available for cohort data, for instance estimators of the predictive ability of a model or of an additional variable, which raise specific problems in case-cohort studies. We have shown that the imputation model must be estimated on all the completely observed subjects (cases and non-cases), with the case indicator included among the explanatory variables. We validated this approach with several sets of simulations: 1) completely simulated data, where the true parameter values were known; 2) case-cohort data simulated from the PRIME cohort, where no phase-1 variable (observed on all subjects) was strongly predictive of the phase-2 variable (incompletely observed); 3) case-cohort data simulated from the NWTS cohort, where a phase-1 variable strongly predictive of the phase-2 variable was available. These simulations showed that multiple imputation generally provided unbiased estimates of the risk ratios. For the phase-1 variables, the estimates were almost as precise as those from the full cohort, slightly more precise than the calibrated estimator of Breslow et al., and, above all, more precise than the classical weighted estimators. For the phase-2 variables, the multiple imputation estimator was generally unbiased, more precise than the classical weighted estimators, and about as precise as the calibrated estimator. The simulations based on the NWTS cohort gave less satisfactory results for effects involving the phase-2 variable: the multiple imputation estimators were slightly biased and less precise than the weighted estimators. This is explained by the interaction terms involving the phase-2 variable in the analysis model, which made it necessary to estimate separate imputation models in different strata of the cohort, some of which contained too few cases for the asymptotic conditions to hold. We recommend using multiple imputation to obtain more precise estimates of the risk ratios, while checking that they are similar to those from the weighted analyses. Our simulations also showed that multiple imputation provided estimates of the predictive value of a model (Harrell's C) or of an additional variable (difference of C indices, NRI or IDI) similar to those obtained from the full cohort.
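The abstract describes multiple imputation as generating several completed data sets and combining the resulting estimates. A minimal sketch of the standard combination step (Rubin's rules) follows. It is not the thesis's own code; the function name and the numbers are made up for illustration, and the per-data-set estimates are simply assumed to be log risk ratios with their squared standard errors.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine m per-imputation estimates and their variances with Rubin's rules.

    Returns the pooled estimate and its total variance:
    total = within + (1 + 1/m) * between.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)          # average of the m point estimates
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # sample variance across imputations (m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Made-up log risk ratios and squared standard errors from m = 5 imputations.
log_rr = [0.42, 0.38, 0.45, 0.40, 0.41]
se_sq = [0.010, 0.011, 0.009, 0.010, 0.012]
pooled_log_rr, total_var = pool_rubin(log_rr, se_sq)
```

The between-imputation term is what carries the uncertainty about the missing values into the final standard error; with a single imputation it would be lost.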
|
80 |
Sensitivity Analyses in Empirical Studies Plagued with Missing Data
Liublinska, Viktoriia 07 June 2014
Analyses of data with missing values often require assumptions about missingness mechanisms that cannot be assessed empirically, highlighting the need for sensitivity analyses. However, universal recommendations for reporting missing data and conducting sensitivity analyses in empirical studies are scarce. Both steps are often neglected by practitioners due to the lack of clear guidelines for summarizing missing data and systematic explorations of alternative assumptions, as well as the typical attendant complexity of missing not at random (MNAR) models. We propose graphical displays that help visualize and systematize the results of sensitivity analyses, building upon the idea of "tipping-point" analysis for experiments with dichotomous treatment. The resulting "enhanced tipping-point displays" (ETP) are convenient summaries of conclusions drawn from using different modeling assumptions about the missingness mechanisms, applicable to a broad range of outcome distributions. We also describe a systematic way of exploring MNAR models using ETP displays, based on a pattern-mixture factorization of the outcome distribution, and present a set of sensitivity parameters that arises naturally from such a factorization. The primary goal of the displays is to make formal sensitivity analyses more comprehensible to practitioners, thereby helping them assess the robustness of experiments' conclusions. We also present an example of a recent use of ETP displays in a medical device clinical trial, which helped lead to FDA approval. The last part of the dissertation demonstrates another method of sensitivity analysis in the same clinical trial. The trial is complicated by missingness in outcomes "due to death", and we address this issue by employing the Rubin Causal Model and principal stratification. 
We propose an improved method to estimate the joint posterior distribution of estimands of interest using a Hamiltonian Monte Carlo algorithm and demonstrate its superiority for this problem to the standard Metropolis-Hastings algorithm. The proposed methods of sensitivity analyses provide new collections of useful tools for the analysis of data sets plagued with missing values. / Statistics
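The basic tipping-point idea that the abstract builds on can be sketched for a two-arm trial with a binary outcome: try every possible number of successes among the missing outcomes in each arm and record where the conclusion changes. This is only a rough illustration of that basic idea, not the enhanced displays proposed in the dissertation; the function names, the pooled two-proportion z-test, the 0.05 threshold, and the toy counts are all assumptions.

```python
import math


def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value of the pooled two-proportion z-test."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0
    z = (x1 / n1 - x2 / n2) / se
    return math.erfc(abs(z) / math.sqrt(2))


def tipping_grid(succ_t, obs_t, miss_t, succ_c, obs_c, miss_c, alpha=0.05):
    """Map (k_t, k_c) to True when treatment still looks significantly better.

    k_t and k_c are the assumed numbers of successes among the missing
    outcomes in the treatment and control arms.
    """
    n_t, n_c = obs_t + miss_t, obs_c + miss_c
    grid = {}
    for k_t in range(miss_t + 1):
        for k_c in range(miss_c + 1):
            x_t, x_c = succ_t + k_t, succ_c + k_c
            p = two_proportion_p_value(x_t, n_t, x_c, n_c)
            grid[(k_t, k_c)] = (x_t / n_t > x_c / n_c) and p < alpha
    return grid


# Made-up trial: 40/50 observed successes on treatment, 25/50 on control,
# with 10 missing outcomes in each arm.
grid = tipping_grid(40, 50, 10, 25, 50, 10)
```

Plotting `grid` as a heat map over `k_t` and `k_c` gives the kind of picture a tipping-point display summarizes: the boundary between `True` and `False` cells shows how unfavorable the missing outcomes would have to be to overturn the result.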
|