Cluster and discriminant analysis belong to basic classification methods. Using cluster analysis can be a disordered group of objects organized into several internally homogeneous classes or clusters. Discriminant analysis creates knowledge based on the jurisdiction of existing classes classification rule, which can be then used for classifying units with an unknown group membership. The aim of this thesis is a comparison of discriminant analysis and different methods of cluster analysis. To reflect the distances between objects within each cluster, squeared Euclidean and Mahalanobis distances are used. In total, there are 28 datasets analyzed in this thesis. In case of leaving correlated variables in the set and applying squared Euclidean distance, Ward´s method classified objects into clusters the most successfully (42,0 %). After changing metrics on the Mahalanobis distance, the most successful method has become the furthest neighbor method (37,5 %). After removing highly correlated variables and applying methods with Euclidean metric, Ward´s method was again the most successful in classification of objects (42,0%). From the result implies that cluster analysis is more precise when excluding correlated variables than when leaving them in a dataset. The average result of discriminant analysis for data with correlated variables and also without correlated variables is 88,7 %.
Identifer | oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:201591 |
Date | January 2015 |
Creators | Rynešová, Pavlína |
Contributors | Löster, Tomáš, Řezanková, Hana |
Publisher | Vysoká škola ekonomická v Praze |
Source Sets | Czech ETDs |
Language | Czech |
Detected Language | English |
Type | info:eu-repo/semantics/masterThesis |
Rights | info:eu-repo/semantics/restrictedAccess |
Page generated in 0.0016 seconds