
Analysis of Healthcare Coverage Using Data Mining Techniques

This study explores disparities in healthcare coverage through a quantitative analysis of a large dataset from the United States. One objective is to build supervised models, including decision trees and neural networks, to study the influential factors in healthcare coverage. We also identify groups of people with health coverage problems and inconsistencies using unsupervised modeling, including the K-Means clustering algorithm.
Our modeling is based on a dataset retrieved from the Medical Expenditure Panel Survey, with 98,175 records in the original dataset. After pre-processing (binning, cleaning, handling missing values, and balancing), the dataset contains 26,932 records and 23 variables. We build 50 classification models in IBM SPSS Modeler using decision trees and neural networks. The accuracy of the models varies between 76% and 81%. The models can predict healthcare coverage for a new sample based on its significant attributes. We demonstrate that the decision tree models provide higher accuracy than the models based on neural networks. Having extensively analyzed the results, we also find the most influential factors in healthcare coverage to be access to care, age, poverty level of family, and race/ethnicity.
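
The unsupervised step named in the abstract, K-Means clustering, can be illustrated with a minimal sketch of the standard Lloyd iteration. This is not the thesis's implementation, which used IBM SPSS Modeler; the function name `kmeans`, the six sample points, the two scaled features, and the choice of k = 2 are all hypothetical and chosen only to show how records are grouped.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's algorithm. Returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initialize centroids by picking k distinct points at random.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance).
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[c])
        # Stop once neither the assignments nor the centroids change.
        if new_labels == labels and new_centroids == centroids:
            break
        labels, centroids = new_labels, new_centroids
    return centroids, labels


# Hypothetical records: two made-up features scaled to [0, 1].
points = [
    (0.20, 0.10), (0.25, 0.15), (0.30, 0.10),
    (0.80, 0.90), (0.85, 0.80), (0.90, 0.95),
]
centroids, labels = kmeans(points, k=2)
print(labels)
```

On this toy data the first three records end up in one cluster and the last three in the other; which cluster receives index 0 depends on the random initialization.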

Identifier: oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:OOU.#10393/20547
Date: 12 January 2012
Creators: Tekieh, Mohammad Hossein
Source Sets: Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada
Language: English
Detected Language: English
Type: Thèse / Thesis
