Dimension Reduction and Clustering of High Dimensional Data using a Mixture of Generalized Hyperbolic Distributions

Model-based clustering is a probabilistic approach that views each cluster as a component
in an appropriate mixture model. The Gaussian mixture model is one of the
most widely used model-based methods. However, this model tends to perform poorly
when clustering high-dimensional data due to the over-parametrized solutions that
arise in high-dimensional spaces. This work instead considers the approach of combining
dimension reduction techniques with clustering via a mixture of generalized
hyperbolic distributions. The dimension reduction techniques, principal component
analysis and factor analysis, along with their extensions, were reviewed. Then the aforementioned
dimension reduction techniques were individually paired with the mixture
of generalized hyperbolic distributions in order to demonstrate the clustering performance
achieved under each method using both simulated and real data sets. For a
majority of the data sets, the clustering method utilizing principal component analysis
exhibited better classification results compared to the clustering method based
on extending the factor analysis model. / Thesis / Master of Science (MSc)

Identifier oai:union.ndltd.org:mcmaster.ca/oai:macsphere.mcmaster.ca:11375/22758
Date January 2018
Creators Pathmanathan, Thinesh
Contributors McNicholas, Sharon, Statistics
Source Sets McMaster University
Language English
Detected Language English
Type Thesis
