PCA and CVA biplots : a study of their underlying theory and quality measures

Thesis (MComm)--Stellenbosch University, 2013. / ENGLISH ABSTRACT: The main topics of study in this thesis are the Principal Component Analysis (PCA)
and Canonical Variate Analysis (CVA) biplots, with the primary focus falling on the
quality measures associated with these biplots. A detailed study of different routes
along which PCA and CVA can be derived precedes the study of the PCA biplot
and CVA biplot respectively. Different perspectives on PCA and CVA highlight
different aspects of the theory that underlies PCA and CVA biplots respectively and
so contribute to a more solid understanding of these biplots and their interpretation.
PCA is studied via the routes followed by Pearson (1901) and Hotelling (1933).
CVA is studied from the perspectives of Linear Discriminant Analysis, Canonical
Correlation Analysis as well as a two-step approach introduced in Gower et al.
(2011). The close relationship between CVA and Multivariate Analysis of Variance
(MANOVA) also receives some attention.
An explanation of the construction of the PCA biplot is provided subsequent to
the study of PCA. Thereafter follows an in-depth investigation of quality measures of
the PCA biplot as well as the relationships between these quality measures. Specific
attention is given to the effect of standardisation on the PCA biplot and its quality
measures.
Following the study of CVA is an explanation of the construction of the weighted
CVA biplot as well as two different unweighted CVA biplots based on the two-step
approach to CVA. Specific attention is given to the effect of accounting for group sizes
in the construction of the CVA biplot on the representation of the group structure
underlying a data set. It was found that larger groups tend to be better separated
from other groups in the weighted CVA biplot than in the corresponding unweighted
CVA biplots. Similarly, it was found that smaller groups tend to be separated to
a greater extent from other groups in the unweighted CVA biplots than in the
corresponding weighted CVA biplot.
A detailed investigation of previously defined quality measures of the CVA biplot
follows the study of the CVA biplot. It was found that the accuracy with which the
group centroids of larger groups are approximated in the weighted CVA biplot is
usually higher than that in the corresponding unweighted CVA biplots. Three new
quality measures that assess the accuracy of the Pythagorean distances in the CVA
biplot are also defined. These quality measures assess the accuracy of the Pythagorean
distances between the group centroids, the Pythagorean distances between the
individual samples and the Pythagorean distances between the individual samples
and group centroids in the CVA biplot respectively. / AFRIKAANS ABSTRACT (English translation): The main topics of study in this thesis are the Principal Component Analysis (PCA)
biplot and the Canonical Variate Analysis (CVA) biplot, with the primary
focus on the quality measures associated with them. A detailed
study of different routes along which PCA and CVA can be derived
precedes the study of the PCA and CVA biplots respectively. Different
perspectives on PCA and CVA highlight different aspects of the theory
underlying the PCA and CVA biplots respectively and so contribute
to a more comprehensive understanding of these biplots and their interpretations. PCA
is studied via the routes followed by Pearson (1901) and Hotelling
(1933). CVA is studied from the perspectives of Linear Discriminant Analysis,
Canonical Correlation Analysis as well as a two-step approach as proposed in
Gower et al. (2011). The close relationship between CVA and Multivariate Analysis
of Variance (MANOVA) also receives attention.
An explanation of the construction of the PCA biplot is provided after
the study of PCA. Thereafter follows an in-depth investigation of the PCA
biplot quality measures as well as the relationships among these
quality measures. Specific attention is given to the effect of standardisation
on the PCA biplot and its quality measures.
Following the study of CVA is an explanation of the construction of
the weighted CVA biplot as well as two different unweighted CVA biplots
based on the two-step approach to CVA. Specific attention is given to
the effect that taking group sizes into account in the construction of the CVA
biplot has on the representation of the group structure underlying a data
set. It was found that larger groups are better separated from other groups in the weighted
CVA biplot than in the corresponding unweighted CVA biplot. Similarly,
it was found that smaller groups are separated to a greater extent from other groups in
the unweighted CVA biplot than in the corresponding weighted CVA biplot.
A detailed investigation of previously defined quality measures of
the CVA biplot follows the study of the CVA biplot. It was found
that the accuracy with which the group means of larger groups are approximated
in the weighted CVA biplot is usually higher than in the corresponding
unweighted CVA biplots. Three new quality measures that measure the accuracy
of the Pythagorean distances in the CVA biplot are defined. These
quality measures describe, respectively, the accuracy of the representation
of the Pythagorean distances between the group means, the Pythagorean distances
between the individual observations and the Pythagorean distances between the individual
observations and group means in the CVA biplot.

Identifier: oai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:sun/oai:scholar.sun.ac.za:10019.1/80363
Date: 03 1900
Creators: Brand, Hilmarie
Contributors: Le Roux, N. J., Lubbe, S., Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistics and Actuarial Science.
Publisher: Stellenbosch : Stellenbosch University
Source Sets: South African National ETD Portal
Language: en_ZA
Detected Language: Unknown
Type: Thesis
Format: 347 p.
Rights: Stellenbosch University
