1 |
From scenario association to categorical data clustering / Pan, Yuanyi. January 2005
Thesis (M.Sc.)--York University, 2005. Graduate Programme in Mathematics and Statistics. / Typescript. Includes bibliographical references (leaves 61-62). Also available on the Internet. MODE OF ACCESS via web browser by entering the following URL: http://gateway.proquest.com/openurl?url%5Fver=Z39.88-2004&res%5Fdat=xri:pqdiss &rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&rft_dat=xri:pqdiss:MR11874
|
2 |
Hodnocení úspěšnosti koeficientů pro stanovení optimálního počtu shluků ve shlukové analýze / The evaluation of coefficients when determining the optimal number of clusters in cluster analysis / Novák, Miroslav. January 2014
The objective of this thesis is to evaluate selected cluster analysis coefficients for determining the optimal number of clusters. The evaluation is performed on 20 independent real datasets using the statistical software SYSTAT 13.1. The main part of the thesis applies the coefficients RMSSTD, CHF, PTS, DB and Dunn's index to real datasets, because the evaluation of clustering results receives insufficient attention in scientific publications. The main goal is to establish whether the selected coefficients can be applied in real situations. The second goal is to compare selected clustering methods and their corresponding metrics when determining the optimal number of clusters. The thesis concludes that the number of clusters indicated by these coefficients cannot be considered reliable: when applied to the real data, none of the selected coefficients exceeded a success rate of 40%, so their practical use is very limited. In the practical analysis, the best method for identifying the known number of clusters was average linkage with Euclidean distance, while the worst was Ward's method with Euclidean distance.
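As an illustration of how one such coefficient can be used to pick the number of clusters, the following is a minimal sketch, not the thesis's SYSTAT procedure. It assumes that CHF denotes the Calinski-Harabasz pseudo-F index, computed as the between-cluster sum of squares divided by k - 1, over the within-cluster sum of squares divided by n - k. The toy points, the candidate partitions and all function names are hypothetical.

```python
from math import fsum


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    dims = len(points[0])
    return [fsum(p[d] for p in points) / n for d in range(dims)]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return fsum((x - y) ** 2 for x, y in zip(a, b))


def calinski_harabasz(points, labels):
    """Pseudo-F index: (B / (k - 1)) / (W / (n - k)).

    B is the between-cluster sum of squares, W the within-cluster one.
    """
    n = len(points)
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)
    if not 2 <= k < n:
        raise ValueError("the index needs 2 <= k < n clusters")
    overall = centroid(points)
    between = 0.0
    within = 0.0
    for members in clusters.values():
        c = centroid(members)
        between += len(members) * sq_dist(c, overall)
        within += fsum(sq_dist(p, c) for p in members)
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical toy data: three well-separated groups of four points each.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 0), (10, 1), (11, 0), (11, 1),
    (5, 10), (5, 11), (6, 10), (6, 11),
]

# Hand-made candidate partitions for k = 2, 3, 4 (assumed, not computed
# by a clustering algorithm, to keep the sketch self-contained).
candidate_partitions = {
    2: [0] * 8 + [1] * 4,
    3: [0] * 4 + [1] * 4 + [2] * 4,
    4: [0] * 4 + [1] * 4 + [2] * 2 + [3] * 2,
}

scores = {k: calinski_harabasz(points, labels)
          for k, labels in candidate_partitions.items()}
best_k = max(scores, key=scores.get)  # the k with the highest index wins
print(scores, best_k)
```

On this artificial, cleanly separated data the index peaks at the true number of groups. The thesis reports that on real datasets such coefficients recovered the known number of clusters in fewer than 40% of cases.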
|
3 |
Shluková analýza jako nástroj klasifikace objektů / Cluster analysis as a tool for object classification / Vanišová, Adéla. January 2012
The aim of this thesis is to examine the ability of cluster analysis to segment a data set using selected methods. The data sets consist of quantitative variables. The basic criteria for the data sets are that the number of classes is known and that the class membership of every object is known as well. The cluster analysis was carried out using the known number of classes, and the objects assigned to individual clusters were compared with their original classes. The output was the relative success of classification by the selected methods. Cluster analysis methods cannot by themselves determine an optimal number of clusters, so estimating the optimal number of clusters was the second step of the analysis for each data set. The ability of selected criteria to identify the original number of classes was analyzed by comparing the number of original classes with the estimated optimal number of clusters. The main contribution of this thesis is to validate the ability of selected cluster analysis methods to identify similar objects and to verify the ability of selected criteria to estimate the number of clusters corresponding to the actual distribution of the data. Moreover, this work provides a structured overview of the basic cluster analysis methods and of indicators for estimating the optimal number of clusters.
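The comparison of clusters with known classes can be sketched as follows. The abstract does not say how clusters were matched to classes, so this example assumes a majority-class mapping: each cluster is credited with its most frequent original class, and the relative success is the share of objects that agree with that class. The function name and the sample labels are hypothetical.

```python
from collections import Counter


def classification_success(true_classes, cluster_labels):
    """Share of objects whose class is the majority class of their cluster.

    Assumption: clusters are matched to classes by majority vote; the
    thesis may have used a different matching rule.
    """
    by_cluster = {}
    for cls, lab in zip(true_classes, cluster_labels):
        by_cluster.setdefault(lab, Counter())[cls] += 1
    matched = sum(counts.most_common(1)[0][1]
                  for counts in by_cluster.values())
    return matched / len(true_classes)


# Hypothetical example: 10 objects in 3 known classes, clustered into 3 clusters.
true_classes = ["a"] * 4 + ["b"] * 4 + ["c"] * 2
cluster_labels = [0, 0, 0, 1, 1, 1, 1, 1, 2, 2]

success = classification_success(true_classes, cluster_labels)
print(success)
```

Here one object of class "a" falls into the cluster dominated by class "b", so nine of the ten objects count as correctly classified.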
|
4 |
Hodnocení výsledků metod shlukové analýzy / Evaluation of Cluster Analysis Methods / Löster, Tomáš. January 2004
Cluster analysis includes a range of methods and procedures used primarily for classifying objects. It plays an important role in many areas. Since the resulting assignment of objects to clusters may vary with the selected methods and settings, it is appropriate to assess the results obtained. This paper proposes new coefficients for evaluating these results when objects are characterized by qualitative variables or by variables of different types. The coefficients can be used either to compare different methods (in terms of better outcomes) or to find the optimal number of clusters. All of them are based on measuring variability, which is also used to measure the dissimilarity of objects and clusters. The newly proposed coefficients are applied to real data sets (of different sizes, with different numbers of variables, including variables of different types), and their behavior under different conditions is examined. The data sets include both known and unknown classifications of objects into clusters. Based on the analysis carried out, the modified CHF coefficient can be considered the best coefficient for evaluating clustering results with variables of different types. A local maximum, according to which the clustering results are evaluated, almost always exists. The analysis showed that in most cases this value agrees with the known classification of objects into clusters. The existence of local extremes of the other coefficients depends on the specific data set and is not guaranteed.
|