  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Investigation of Multiple Imputation Methods for Categorical Variables

Miranda, Samantha 01 May 2020
We compare multiple imputation methods for categorical variables using the MICE package in R. Starting from a complete data set, we remove values at several levels of missingness and evaluate the imputation methods at each level. Logistic regression imputation and linear discriminant analysis (LDA) are used for binary variables, multinomial logit imputation and LDA for nominal variables, and ordered logit imputation and LDA for ordinal variables. After imputation, regression coefficients, percent deviation index (PDI) values, and relative frequency tables were computed for each imputed data set at each level of missingness and compared with those of the corresponding complete data set. Logistic regression outperformed LDA for binary variables, and LDA outperformed both multinomial logit imputation and ordered logit imputation for nominal and ordinal variables. Simulations were run to confirm the validity of the results.
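The evaluation loop described in this abstract (remove values at a given level, impute, compare relative frequency tables with the complete data) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the thesis's R/MICE code: the removal mechanism (completely at random), the stand-in imputer (a random draw from observed values), the toy data, and all function names are assumptions made for the example.

```python
import random
from collections import Counter


def remove_at_random(values, fraction, rng):
    """Return a copy with `fraction` of the entries set to None.

    Assumption: values go missing completely at random.
    """
    n_missing = round(fraction * len(values))
    missing_idx = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(values)]


def impute_from_observed(values, rng):
    """Stand-in imputer: fill each None with a random observed value."""
    observed = [v for v in values if v is not None]
    return [rng.choice(observed) if v is None else v for v in values]


def relative_frequencies(values):
    """Relative frequency table of a categorical variable."""
    counts = Counter(values)
    return {category: n / len(values) for category, n in counts.items()}


def max_frequency_gap(complete, imputed):
    """Largest absolute difference between the two frequency tables."""
    f_complete = relative_frequencies(complete)
    f_imputed = relative_frequencies(imputed)
    categories = set(f_complete) | set(f_imputed)
    return max(abs(f_complete.get(c, 0.0) - f_imputed.get(c, 0.0))
               for c in categories)


rng = random.Random(0)
# Hypothetical ordinal variable with 100 complete observations.
complete = ["low"] * 50 + ["medium"] * 30 + ["high"] * 20
rng.shuffle(complete)

for level in (0.1, 0.3, 0.5):
    holed = remove_at_random(complete, level, rng)
    imputed = impute_from_observed(holed, rng)
    print(level, holed.count(None), round(max_frequency_gap(complete, imputed), 3))
```

A model-based method such as logistic regression or LDA imputation would replace `impute_from_observed`; the surrounding loop would stay the same.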
2

Performance Comparison of Multiple Imputation Methods for Quantitative Variables for Small and Large Data with Differing Variability

Onyame, Vincent 01 May 2021
Missing data continue to be one of the main problems in data analysis because they reduce sample representativeness and consequently cause biased estimates. Multiple imputation has been established as an effective way of handling missing data. In this study, we examined multiple imputation methods for quantitative variables on twelve data sets of varied size and variability that were pseudo-generated from an original data set. The methods examined are predictive mean matching, Bayesian linear regression, and non-Bayesian linear regression, as implemented in the MICE (Multivariate Imputation by Chained Equations) package in the statistical software R. The parameter estimates from the linear regression on the imputed data were compared with the closest parameter estimates from the complete data across all twelve data sets.
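For readers unfamiliar with predictive mean matching, the idea can be sketched as below: fit a regression on the observed cases, then fill each missing value with an actually observed value borrowed from a case whose predicted mean is close. This is a simplified, single-predictor illustration in plain Python. It does not reproduce the MICE implementation (which, among other things, also draws the regression parameters at random), and the data and function names are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pmm_impute(xs, ys, donors=3, rng=None):
    """Fill None entries of ys by simplified predictive mean matching."""
    rng = rng or random.Random()
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    intercept, slope = fit_line([x for x, _ in observed],
                                [y for _, y in observed])
    # Pair each observed case's predicted mean with its observed value.
    predicted_observed = [(intercept + slope * x, y) for x, y in observed]

    filled = []
    for x, y in zip(xs, ys):
        if y is not None:
            filled.append(y)
            continue
        target = intercept + slope * x
        # Donor pool: observed cases with the closest predicted means.
        nearest = sorted(predicted_observed,
                         key=lambda pair: abs(pair[0] - target))[:donors]
        filled.append(rng.choice(nearest)[1])
    return filled


# Hypothetical data: y is missing for two cases.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, None, 8.2, 9.8, None, 14.1, 16.0]
print(pmm_impute(xs, ys, donors=3, rng=random.Random(0)))
```

Because every imputed value is copied from a donor, the filled-in values always stay within the range of the observed data, which is one reason the method is popular for quantitative variables.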
3

Comparison of Imputation Methods for Mixed Data Missing at Random

Heidt, Kaitlyn 01 May 2019
A statistician's job is to produce statistical models. When these models are precise and unbiased, we can apply them to new data appropriately. However, when data sets have missing values, the assumptions of statistical methods are violated and the results are biased. The statistician's objective is to implement methods that produce unbiased and accurate results. Research on missing data is growing as modern methods that produce such results emerge, such as the MICE package in the statistical software R. Using real data, we compare four common imputation methods in the MICE package at different levels of missingness. The results were compared in terms of the regression coefficients and adjusted R^2 values obtained from the complete data set. The CART and PMM methods consistently performed better than the OTF and RF methods. The procedures were repeated on a second sample of real data, and the same conclusions were drawn.
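The adjusted R^2 used as a comparison criterion here follows the standard formula 1 - (1 - R^2)(n - 1)/(n - p - 1), where n is the number of observations and p the number of predictors. A small sketch of the computation is given below; the function name and the toy numbers are illustrative assumptions and are not taken from the thesis.

```python
def adjusted_r_squared(ys, fitted, n_predictors):
    """Adjusted R^2 of a fitted regression, given observed and fitted values."""
    n = len(ys)
    mean_y = sum(ys) / n
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_residual = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    r_squared = 1 - ss_residual / ss_total
    return 1 - (1 - r_squared) * (n - 1) / (n - n_predictors - 1)


# Hypothetical observed responses and fitted values from a one-predictor model.
ys_observed = [1.0, 2.0, 3.0, 4.0, 5.0]
ys_fitted = [1.1, 1.9, 3.2, 3.8, 5.0]
print(adjusted_r_squared(ys_observed, ys_fitted, n_predictors=1))
```

Computing this value once for the model fitted to the complete data and once for the model fitted to each imputed data set gives a simple measure of how much an imputation method distorts the fit.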
