Return to search

Performance of Imputation Algorithms on Artificially Produced Missing at Random Data

Missing data is one of the challenges we are facing today in modeling valid statistical models. It reduces the representativeness of the data samples. Hence, population estimates, and model parameters estimated from such data are likely to be biased.
However, the missing data problem is an area under study, and alternative better statistical procedures have been presented to mitigate its shortcomings. In this paper, we review causes of missing data, and various methods of handling missing data. Our main focus is evaluating various multiple imputation (MI) methods from the multiple imputation of chained equation (MICE) package in the statistical software R. We assess how these MI methods perform with different percentages of missing data. A multiple regression model was fit on the imputed data sets and the complete data set. Statistical comparisons of the regression coefficients are made between the models using the imputed data and the complete data.

Identiferoai:union.ndltd.org:ETSU/oai:dc.etsu.edu:etd-4633
Date01 May 2017
CreatorsOketch, Tobias O
PublisherDigital Commons @ East Tennessee State University
Source SetsEast Tennessee State University
LanguageEnglish
Detected LanguageEnglish
Typetext
Formatapplication/pdf
SourceElectronic Theses and Dissertations
RightsCopyright by the authors.

Page generated in 0.0033 seconds