The expectation-maximization (EM) algorithm is a widely used approach to maximum
likelihood estimation in the presence of missing data. This thesis focuses on its
application within the model-based clustering framework. The performance of the
EM algorithm can be highly dependent on how the algorithm is initialized. Several
ways of initializing the EM algorithm have been proposed; however, the best method
to use for initialization remains a somewhat controversial topic. An attempt to
obtain a superior method of initializing the EM algorithm leads to the concept of using
multiple existing methods together in what will be called a 'voting' procedure. This
procedure will use several common initialization methods to cluster the data, then
a final starting $\hat{z}_{ig}$ matrix will be obtained in two ways. The hard 'voting' method
follows a majority rule, whereas the soft 'voting' method takes an average of the
multiple group memberships. The final $\hat{z}_{ig}$ matrix obtained from both methods will
dictate the starting values of $\hat{\pi}_g$, $\hat{\mu}_g$, and $\hat{\Sigma}_g$ used to initialize the EM algorithm.
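The two voting rules described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names (`hard_vote`, `soft_vote`, `initial_parameters`) are hypothetical, the model is assumed to be a Gaussian mixture with mixing proportions, means, and covariance matrices, and the group labels produced by the different initialization methods are assumed to be already aligned (label switching is not handled here).

```python
import numpy as np

def hard_vote(z_list):
    """Majority rule: each method casts one vote per observation."""
    n, G = z_list[0].shape
    votes = np.zeros((n, G))
    for z in z_list:
        # Each method votes for the group with the largest membership value.
        votes[np.arange(n), z.argmax(axis=1)] += 1
    # Assumption: ties go to the lowest group index (argmax behaviour).
    z_hat = np.zeros((n, G))
    z_hat[np.arange(n), votes.argmax(axis=1)] = 1.0
    return z_hat

def soft_vote(z_list):
    """Average the group memberships across methods."""
    return np.mean(z_list, axis=0)

def initial_parameters(X, z_hat):
    """Weighted estimates of proportions, means, and covariances from z_hat.

    Assumes every group receives nonzero total weight.
    """
    n_g = z_hat.sum(axis=0)                  # effective group sizes
    pi = n_g / X.shape[0]                    # mixing proportions
    mu = (z_hat.T @ X) / n_g[:, None]        # group means
    sigma = []
    for g in range(z_hat.shape[1]):
        d = X - mu[g]
        sigma.append((z_hat[:, g, None] * d).T @ d / n_g[g])
    return pi, mu, np.array(sigma)

# Toy data: four points, two groups, three hypothetical initializations.
X = np.array([[0.0, 0.0], [0.0, 1.0], [4.0, 4.0], [4.0, 5.0]])
z_a = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
z_b = np.array([[1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
z_c = np.array([[1, 0], [1, 0], [1, 0], [0, 1]], dtype=float)

z_hard = hard_vote([z_a, z_b, z_c])
z_soft = soft_vote([z_a, z_b, z_c])
pi_hard, mu_hard, sigma_hard = initial_parameters(X, z_hard)
pi_soft, mu_soft, sigma_soft = initial_parameters(X, z_soft)
```

In this toy example the hard vote recovers the memberships in `z_a`, because every observation has a two-to-one or unanimous majority, while the soft vote gives the second observation membership 2/3 in the first group and 1/3 in the second. Either matrix can then supply starting values for an EM run.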
Identifier | oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:OGU.10214/4059 |
Date | 09 October 2012 |
Creators | Dicintio, Sabrina |
Contributors | McNicholas, Paul |
Source Sets | Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada |
Language | English |
Detected Language | English |
Type | Thesis |