This study evaluates how well different clustering techniques perform on categorical response data: how accurately do they work, and do they recover clusters when clusters actually exist? We use the Cancer Problems in Living Scales (CPILS) from the ACS as the categorical variables and lung cancer survivors as the study group. Five methods of cluster analysis are examined for clustering accuracy on both the real CPILS dataset and simulated data: hierarchical cluster analysis (Ward's method), model-based clustering of the raw data, model-based clustering of the factor scores from a maximum likelihood factor analysis, model-based clustering of the predicted scores from independent factor analysis, and latent class clustering. The results from each of the five methods are then compared with the actual classifications. Model-based clustering of the raw data performs worse than the other methods, and latent class clustering is the most appropriate method for the specific categorical data examined. These results are discussed, and recommendations are made regarding future directions for cluster analysis research.
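As an illustration of comparing cluster assignments with actual classifications, the sketch below computes the adjusted Rand index between two labelings. The abstract does not say which agreement measure the thesis uses, so this choice is an assumption. The function name `adjusted_rand_index` and the toy labels are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(true_labels, cluster_labels):
    """Chance-corrected agreement between two labelings of the same items.

    Assumption: the adjusted Rand index is only one possible accuracy
    measure; the thesis may use a different one.
    """
    if len(true_labels) != len(cluster_labels):
        raise ValueError("both labelings must cover the same items")
    n = len(true_labels)
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Contingency counts: items per (class, cluster) cell and per margin.
    cell_counts = Counter(zip(true_labels, cluster_labels))
    class_counts = Counter(true_labels)
    cluster_counts = Counter(cluster_labels)

    # Number of item pairs placed together under each view.
    together_both = sum(comb(c, 2) for c in cell_counts.values())
    together_true = sum(comb(c, 2) for c in class_counts.values())
    together_cluster = sum(comb(c, 2) for c in cluster_counts.values())

    expected = together_true * together_cluster / comb(n, 2)
    maximum = (together_true + together_cluster) / 2
    if maximum == expected:
        return 1.0  # degenerate case: both labelings are trivial
    return (together_both - expected) / (maximum - expected)


# Toy example (made-up labels, not CPILS data).
actual = [0, 0, 1, 1]
relabelled = [1, 1, 0, 0]   # same partition, different label names
crossed = [0, 1, 0, 1]      # splits every actual class in half
ari_same = adjusted_rand_index(actual, relabelled)
ari_crossed = adjusted_rand_index(actual, crossed)
```

A measure of this kind ignores how the clusters are named, so it can be applied in the same way to the output of each of the five methods.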
Identifier | oai:union.ndltd.org:GEORGIA/oai:digitalarchive.gsu.edu:math_theses-1049 |
Date | 22 April 2008 |
Creators | Guo, Ling |
Publisher | Digital Archive @ GSU |
Source Sets | Georgia State University |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | Mathematics Theses |