Missing data continues to be an issue not only in the field of statistics but in any field that deals with data. This is because almost all widely accepted, standard statistical software and methods assume complete data for all the variables included in the analysis. As a result, in most studies, statistical power is weakened and parameter estimates are biased, leading to weak conclusions and generalizations.
Many studies have established that multiple imputation methods are effective ways of handling missing data. This paper examines three imputation methods (predictive mean matching, Bayesian linear regression, and non-Bayesian linear regression) in the MICE package for the statistical software R, to ascertain which of the three imputes data that yields parameter estimates closest to those of the complete data under different percentages of missingness. To compare the complete data with the imputed data, the parameter estimates in each model were evaluated and compared. The paper extends the analysis by generating pseudo data from the original data to establish how the imputation methods perform under varying conditions.
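The comparison described above can be illustrated with a minimal sketch. It is not the thesis code and does not use the MICE package (where the three methods are selected as `pmm`, `norm`, and `norm.nob`). It is a simplified, single-imputation Python analogue under stated assumptions: one predictor, values removed completely at random, a nearest-donor form of mean matching, and a non-Bayesian regression imputation with added residual noise. The function names (`fit_line`, `impute`, `compare`), the simulated data, and the missingness rates are all hypothetical choices made for illustration.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in resid) / (len(xs) - 2)) ** 0.5
    return a, b, sd


def impute(xs, ys, method, rng):
    """Fill the None entries of ys, using xs as the only predictor."""
    obs = [(x, y) for x, y in zip(xs, ys) if y is not None]
    ox, oy = zip(*obs)
    a, b, sd = fit_line(ox, oy)
    filled = []
    for x, y in zip(xs, ys):
        if y is not None:
            filled.append(y)
        elif method == "regression_noise":
            # Non-Bayesian regression: prediction plus a random residual.
            filled.append(a + b * x + rng.gauss(0, sd))
        elif method == "mean_matching":
            # Simplified matching: borrow the observed y of the single
            # donor whose predicted value is closest to this case's.
            donor = min(obs, key=lambda p: abs((a + b * p[0]) - (a + b * x)))
            filled.append(donor[1])
        else:
            raise ValueError(f"unknown method: {method}")
    return filled


def compare(n=500, rates=(0.1, 0.3, 0.5), seed=1):
    """Return the complete-data slope and each method's absolute deviation."""
    rng = random.Random(seed)
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [2.0 + 1.5 * x + rng.gauss(0, 1) for x in xs]  # simulated data
    _, b_full, _ = fit_line(xs, ys)
    results = {}
    for rate in rates:
        # Remove values completely at random at the given rate.
        holed = [None if rng.random() < rate else y for y in ys]
        for method in ("regression_noise", "mean_matching"):
            _, b_imp, _ = fit_line(xs, impute(xs, holed, method, rng))
            results[(rate, method)] = abs(b_imp - b_full)
    return b_full, results


if __name__ == "__main__":
    slope, deviations = compare()
    print(f"complete-data slope: {slope:.3f}")
    for (rate, method), dev in sorted(deviations.items()):
        print(f"missing={rate:.0%}  {method:<17} |slope diff| = {dev:.3f}")
```

A smaller deviation means the imputed data reproduces the complete-data estimate more closely at that missingness rate. A multiple-imputation analysis would repeat the imputation several times and pool the estimates, which this sketch omits.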
Identifier | oai:union.ndltd.org:ETSU/oai:dc.etsu.edu:etd-4839 |
Date | 01 May 2018 |
Creators | Addo, Evans Dapaa |
Publisher | Digital Commons @ East Tennessee State University |
Source Sets | East Tennessee State University |
Language | English |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | Electronic Theses and Dissertations |
Rights | Copyright by the authors. |