Dimension Reduction and Clustering of High Dimensional Data using a Mixture of Generalized Hyperbolic Distributions

Model-based clustering is a probabilistic approach that views each cluster as a component
in an appropriate mixture model. The Gaussian mixture model is one of the
most widely used model-based methods. However, this model tends to perform poorly
when clustering high-dimensional data due to the over-parametrized solutions that
arise in high-dimensional spaces. This work instead considers the approach of combining
dimension reduction techniques with clustering via a mixture of generalized
hyperbolic distributions. The dimension reduction techniques of principal component
analysis and factor analysis, along with their extensions, were reviewed. These
dimension reduction techniques were then individually paired with the mixture
of generalized hyperbolic distributions in order to demonstrate the clustering performance
achieved under each method using both simulated and real data sets. For a
majority of the data sets, the clustering method utilizing principal component analysis
exhibited better classification results than the clustering method based
on extending the factor analysis model.

Thesis / Master of Science (MSc)

http://hdl.handle.net/11375/22758

Identifier	oai:union.ndltd.org:mcmaster.ca/oai:macsphere.mcmaster.ca:11375/22758
Date	January 2018
Creators	Pathmanathan, Thinesh
Contributors	McNicholas, Sharon, Statistics
Source Sets	McMaster University
Language	English
Detected Language	English
Type	Thesis
