The aim of this thesis is to examine the cluster analysis ability segment the data set by selected methods. The data sets are consisting of quantitative variables. The basic criterion for the data sets is that the number of classes has to be known and the next criterion is that the membership of all object to each class has to be known too. Execution of the cluster analysis was based on knowledge about the number of classes. Classified objects to individual clusters were compared with its original classes. The output was the relative success of classification by selected methods. Cluster analysis methods are not able to determine an optimal number of clusters. Estimates of the optimal number of clusters were the second step in analysis for each data set. The ability of selected criteria identify the original number of classes was analyzed by comparing numbers of original classes and numbers of optimal clusters. The main contribution of this thesis is the validation of the ability of selected cluster analysis methods to identify similar objects and verify the ability of selected criteria to estimate the number of clusters corresponding to the real file distribution. Moreover, this work provides a structured overview of the basic cluster analysis methods and indicators for estimating the optimal number of clusters.
Identifer | oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:114191 |
Date | January 2012 |
Creators | Vanišová, Adéla |
Contributors | Löster, Tomáš, Bílková, Diana |
Publisher | Vysoká škola ekonomická v Praze |
Source Sets | Czech ETDs |
Language | Czech |
Detected Language | English |
Type | info:eu-repo/semantics/masterThesis |
Rights | info:eu-repo/semantics/restrictedAccess |
Page generated in 0.0024 seconds