1 |
Planned Missing Data in Mediation Analysis. January 2015
abstract: This dissertation examines a planned missing data design in the context of mediational analysis. The study considered a scenario in which the high cost of an expensive mediator limited sample size, but in which less expensive mediators could be gathered on a larger sample size. Simulated multivariate normal data were generated from a latent variable mediation model with three observed indicator variables, M1, M2, and M3. Planned missingness was implemented on M1 under the missing completely at random mechanism. Five analysis methods were employed: latent variable mediation model with all three mediators as indicators of a latent construct (Method 1), auxiliary variable model with M1 as the mediator and M2 and M3 as auxiliary variables (Method 2), auxiliary variable model with M1 as the mediator and M2 as a single auxiliary variable (Method 3), maximum likelihood estimation including all available data but incorporating only mediator M1 (Method 4), and listwise deletion (Method 5).
The main outcome of interest was empirical power to detect the mediated effect. The main effects of mediation effect size, sample size, and missing data rate performed as expected, with power increasing for increasing mediation effect sizes, increasing sample sizes, and decreasing missing data rates. Consistent with expectations, power was greatest for analysis methods that included all three mediators, and power decreased with analysis methods that included less information. Across all design cells relative to the complete data condition, Method 1 with 20% missingness on M1 produced only 2.06% loss in power for the mediated effect; with 50% missingness, 6.02% loss; and with 80% missingness, only 11.86% loss. Method 2 exhibited 20.72% power loss at 80% missingness, even though the total amount of data utilized was the same as for Method 1. Methods 3–5 exhibited greater power loss. Compared to an average power loss of 11.55% across all levels of missingness for Method 1, average power losses for Methods 3, 4, and 5 were 23.87%, 29.35%, and 32.40%, respectively. In conclusion, planned missingness in a multiple mediator design may permit higher quality characterization of the mediator construct at feasible cost. / Dissertation/Thesis / Doctoral Dissertation Psychology 2015
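The design described in this abstract can be illustrated with a minimal sketch: cases are drawn from a latent mediation model with three indicators, and M1 is then blanked out for a randomly chosen share of cases, independent of any data (MCAR). The path coefficients, loading, residual standard deviation, sample size, and function names below are illustrative assumptions, not values or code from the dissertation.

```python
import random

def simulate_case(rng, a=0.4, b=0.4, loading=0.8):
    """Draw one case from a hypothetical latent mediation model X -> M -> Y."""
    x = rng.gauss(0, 1)
    latent_m = a * x + rng.gauss(0, 1)      # latent mediator
    y = b * latent_m + rng.gauss(0, 1)      # outcome
    # Three observed indicators of the latent mediator (assumed equal loadings)
    m1, m2, m3 = (loading * latent_m + rng.gauss(0, 0.6) for _ in range(3))
    return {"x": x, "m1": m1, "m2": m2, "m3": m3, "y": y}

def impose_planned_missingness(cases, rate, rng):
    """Set M1 to None for a random share of cases, ignoring all data values (MCAR)."""
    n_missing = round(rate * len(cases))
    missing_idx = set(rng.sample(range(len(cases)), n_missing))
    return [dict(c, m1=None) if i in missing_idx else c
            for i, c in enumerate(cases)]

rng = random.Random(2015)
cases = [simulate_case(rng) for _ in range(500)]
planned = impose_planned_missingness(cases, rate=0.8, rng=rng)

# Only the expensive mediator M1 is incomplete; M2 and M3 stay fully observed.
n_observed_m1 = sum(c["m1"] is not None for c in planned)
n_observed_m2 = sum(c["m2"] is not None for c in planned)
```

With 80% planned missingness on 500 cases, M1 is measured on 100 cases while the cheaper indicators remain available for everyone, which is the situation the five analysis methods are compared under.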
|
2 |
Performance of Imputation Algorithms on Artificially Produced Missing at Random Data. Oketch, Tobias O. 01 May 2017
Missing data is one of the main challenges in building valid statistical models. It reduces the representativeness of the sample, so population estimates and model parameters estimated from such data are likely to be biased.
However, the missing data problem is an active area of study, and better statistical procedures have been proposed to mitigate its effects. In this paper, we review the causes of missing data and various methods of handling it. Our main focus is evaluating various multiple imputation (MI) methods from the Multivariate Imputation by Chained Equations (MICE) package in the statistical software R. We assess how these MI methods perform with different percentages of missing data. A multiple regression model was fit to the imputed data sets and to the complete data set. The regression coefficients from the models fit to the imputed data are then compared statistically with those from the complete data.
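The abstract does not say how coefficients from the several imputed data sets were combined before comparison. A common choice is Rubin's rules, sketched below under that assumption; the function name and the numbers in the usage lines are made up for illustration and are not results from the thesis.

```python
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed data sets using Rubin's rules.

    estimates: the coefficient estimate from each imputed data set
    variances: the squared standard error of that estimate in each data set
    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    pooled = mean(estimates)             # average of the m estimates
    within = mean(variances)             # average within-imputation variance
    between = variance(estimates)        # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total

# Hypothetical slope estimates and squared standard errors from three imputations
pooled_slope, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The pooled standard error is the square root of the total variance, so disagreement between imputations widens the interval around the pooled coefficient.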
|
3 |
Missing Data - A Gentle Introduction. Österlund, Vilgot. January 2020
This thesis provides an introduction to methods for handling missing data. A thorough review of earlier methods and of the development of the field of missing data is provided. The thesis presents the methods recommended in today’s literature: multiple imputation and maximum likelihood estimation. A simulation study is performed to see whether there are circumstances in small samples in which either of the two methods is to be preferred. To show the importance of handling missing data, multiple imputation and maximum likelihood are compared to listwise deletion. The results of the simulation study do not show any crucial differences between multiple imputation and maximum likelihood when it comes to point estimates. Some differences are seen in the estimation of the confidence intervals, which favour multiple imputation. The difference decreases with increasing sample size, and more studies are needed to draw definite conclusions. Further, the results show that listwise deletion leads to biased estimates under a missing at random mechanism. The methods are also applied to a real dataset, the Swedish enrollment registry, to show how they work in a practical application.
|
4 |
Methodology for Handling Missing Data in Nonlinear Mixed Effects Modelling. Johansson, Åsa M. January 2014
To obtain a better understanding of the pharmacokinetic and/or pharmacodynamic characteristics of an investigated treatment, clinical data is often analysed with nonlinear mixed effects modelling. The developed models can be used to design future clinical trials or to guide individualised drug treatment. Missing data is a frequently encountered problem in analyses of clinical data, and in order not to compromise the predictive ability of the developed model, it is of great importance that the method chosen to handle the missing data is adequate for its purpose. The overall aim of this thesis was to develop methods for handling missing data in the context of nonlinear mixed effects models and to compare strategies for handling missing data in order to provide guidance for efficient handling and to show the consequences of inappropriate handling of missing data. In accordance with missing data theory, all missing data can be divided into three categories: missing completely at random (MCAR), missing at random (MAR) and missing not at random (MNAR). When data are MCAR, the underlying missing data mechanism does not depend on any observed or unobserved data; when data are MAR, the underlying missing data mechanism depends on observed data but not on unobserved data; when data are MNAR, the underlying missing data mechanism depends on the unobserved data itself. Strategies and methods for handling missing observation data and missing covariate data were evaluated. These evaluations showed that the most frequently used estimation algorithm in nonlinear mixed effects modelling (first-order conditional estimation) resulted in biased parameter estimates regardless of the missing data mechanism. However, expectation maximization (EM) algorithms (e.g. importance sampling) resulted in unbiased and precise parameter estimates as long as data were MCAR or MAR.
When the observation data are MNAR, a proper method for handling the missing data has to be applied to obtain unbiased and precise parameter estimates, regardless of the estimation algorithm. The evaluation of different methods for handling missing covariate data showed that a correctly implemented multiple imputation method and full maximum likelihood modelling methods resulted in unbiased and precise parameter estimates when covariate data were MCAR or MAR. When the covariate data were MNAR, the only method resulting in unbiased and precise parameter estimates was a full maximum likelihood modelling method in which an extra parameter was estimated, correcting for the unknown missing data mechanism's dependence on the missing data. This thesis presents new insight into the dynamics of missing data in nonlinear mixed effects modelling. Strategies for handling different types of missing data have been developed and compared in order to provide guidance for efficient handling and to show the consequences of inappropriate handling of missing data.
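The three missing data categories defined in this abstract can be made concrete with a toy simulation. The sketch below is not a nonlinear mixed effects model and is not taken from the thesis; it assumes a single always-observed covariate, a value that depends on it, and hypothetical missingness probabilities, and it reports the mean of the values that remain observed (the true mean is 0).

```python
import random
from statistics import mean

def observed_mean(mechanism, n=20000, seed=1):
    """Mean of the values left observed under a given missing data mechanism."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n):
        covariate = rng.gauss(0, 1)            # always observed
        value = covariate + rng.gauss(0, 1)    # may go missing; true mean is 0
        if mechanism == "MCAR":
            p_missing = 0.5                            # depends on no data at all
        elif mechanism == "MAR":
            p_missing = 0.8 if covariate > 0 else 0.2  # depends on observed data only
        elif mechanism == "MNAR":
            p_missing = 0.8 if value > 0 else 0.2      # depends on the value itself
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        if rng.random() >= p_missing:
            kept.append(value)
    return mean(kept)

mcar_mean = observed_mean("MCAR")
mar_mean = observed_mean("MAR")
mnar_mean = observed_mean("MNAR")
```

In this toy setting the naive observed mean stays close to 0 under MCAR and is pulled below 0 under both MAR and MNAR. Under MAR the distortion is explained by the observed covariate, so a method that conditions on it can correct for it; under MNAR the observed data alone do not carry that information.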
|