
Performance Comparison of Imputation Algorithms on Missing at Random Data

Missing data continues to be an issue not only in the field of statistics but in any field that deals with data. This is because almost all widely accepted statistical software and methods assume complete data for every variable included in the analysis. As a result, in most studies affected by missing data, statistical power is weakened and parameter estimates are biased, leading to weak conclusions and generalizations.
Many studies have established that multiple imputation is an effective way of handling missing data. This paper examines three imputation methods in the MICE package for the statistical software R (predictive mean matching, Bayesian linear regression, and non-Bayesian linear regression) to ascertain which of them imputes data that yields parameter estimates closest to those of the complete data at different percentages of missingness. To compare the complete data with the imputed data, the parameter estimates in each model were evaluated against one another. The paper extends the analysis by generating pseudo data from the original data to establish how the imputation methods perform under varying conditions.
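The comparison described above can be illustrated with a minimal sketch. It is written in Python with only the standard library, not with the MICE package the thesis uses, and it covers only a simplified form of predictive mean matching that omits the parameter-draw step a full multiple-imputation implementation would typically include. The function names (ols, make_mar, pmm_impute, compare), the simulated data model (intercept 2.0, slope 1.5), and the missingness mechanism are all assumptions made for illustration and are not taken from the thesis.

```python
import random


def ols(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def make_mar(xs, ys, rng, rate=0.3):
    """Set some y values to None with a probability that depends only on x.

    Because x is always observed, this is a missing-at-random mechanism.
    Cases above the median of x are four times as likely to lose y as
    cases below it; the overall missing fraction is roughly `rate`.
    """
    median = sorted(xs)[len(xs) // 2]
    out = []
    for x, y in zip(xs, ys):
        p = 1.6 * rate if x > median else 0.4 * rate
        out.append(None if rng.random() < p else y)
    return out


def pmm_impute(xs, ys, rng, donors=5):
    """One simplified predictive-mean-matching pass (assumed variant).

    Fits y ~ x on the observed cases, then fills each missing y with the
    observed y of a donor drawn at random from the `donors` observed cases
    whose predicted values are closest to the missing case's prediction.
    """
    obs = [(x, y) for x, y in zip(xs, ys) if y is not None]
    a, b = ols([x for x, _ in obs], [y for _, y in obs])
    filled = []
    for x, y in zip(xs, ys):
        if y is None:
            pred = a + b * x
            nearest = sorted(obs, key=lambda d: abs((a + b * d[0]) - pred))[:donors]
            y = rng.choice(nearest)[1]
        filled.append(y)
    return filled


def compare(n=500, m=5, rate=0.3, seed=1):
    """Compare complete-data estimates with estimates pooled over m imputations."""
    rng = random.Random(seed)
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [2.0 + 1.5 * x + rng.gauss(0, 1) for x in xs]  # assumed data model
    complete = ols(xs, ys)
    ys_missing = make_mar(xs, ys, rng, rate)
    fits = [ols(xs, pmm_impute(xs, ys_missing, rng)) for _ in range(m)]
    # Pooled point estimate: the average of the m per-imputation estimates.
    pooled = (sum(f[0] for f in fits) / m, sum(f[1] for f in fits) / m)
    return {
        "complete": complete,
        "pooled": pooled,
        "n_missing": sum(y is None for y in ys_missing),
    }


if __name__ == "__main__":
    result = compare()
    print("missing values:", result["n_missing"])
    print("complete-data (intercept, slope):", result["complete"])
    print("pooled imputed (intercept, slope):", result["pooled"])
```

Calling compare with different values of rate gives a rough sense of how the gap between the complete-data and imputed estimates changes with the percentage of missingness, which is the kind of comparison the abstract describes.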

Identifier: oai:union.ndltd.org:ETSU/oai:dc.etsu.edu:etd-4839
Date: 01 May 2018
Creators: Addo, Evans Dapaa
Publisher: Digital Commons @ East Tennessee State University
Source Sets: East Tennessee State University
Language: English
Detected Language: English
Type: text
Format: application/pdf
Source: Electronic Theses and Dissertations
Rights: Copyright by the authors.
