Return to search

Stability Selection of the Number of Clusters

Selecting the number of clusters is one of the greatest challenges in clustering analysis. In this thesis, we propose a variety of stability selection criteria based on cross validation for determining the number of clusters. Clustering stability measures the agreement of clusterings obtained by applying the same clustering algorithm on multiple independent and identically distributed samples. We propose to measure the clustering stability by the correlation between two clustering functions. These criteria are motivated by the concept of clustering instability proposed by Wang (2010), which is based on a form of clustering distance. In addition, the effectiveness and robustness of the proposed methods are numerically demonstrated on a variety of simulated and real world samples.

Identiferoai:union.ndltd.org:GEORGIA/oai:digitalarchive.gsu.edu:math_theses-1099
Date18 April 2011
CreatorsReizer, Gabriella v
PublisherDigital Archive @ GSU
Source SetsGeorgia State University
Detected LanguageEnglish
Typetext
Formatapplication/pdf
SourceMathematics Theses

Page generated in 0.002 seconds