1 |
Biplots based on principal surfaces
Ganey, Raeesa 28 April 2020
Principal surfaces are smooth two-dimensional surfaces that pass through the middle of a p-dimensional data set. They minimise the distance from the data points and provide a nonlinear summary of the data. The surfaces are nonparametric and their shape is suggested by the data. A surface is found using an iterative procedure which starts with a linear summary, typically a principal component plane. Each successive iteration is a local average of the p-dimensional points, where the average is based on a projection of each point onto the nonlinear surface of the previous iteration. Biplots are considered extensions of the ordinary scatterplot that provide for more than three variables. When the differences between data points are measured using a Euclidean embeddable dissimilarity function, observations and the associated variables can be displayed on a nonlinear biplot. A nonlinear biplot is predictive if information on variables is added in such a way that it allows the values of the variables to be estimated for points in the biplot. Prediction trajectories, which tend to be nonlinear, are created on the biplot to allow information about variables to be estimated. The goal is to extend the idea of nonlinear biplot methodology onto principal surfaces. The ultimate emphasis is on high-dimensional data, where the nonlinear biplot based on a principal surface allows for visualisation of samples, variable trajectories and predictive sets of contour lines. The proposed biplot provides more accurate predictions, with the additional feature of visualising the extent of nonlinearity that exists in the data.
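The iterative procedure described in this abstract (a linear start, then alternating local averaging and projection) can be illustrated with a very crude sketch. This is not the thesis's algorithm: the function name `principal_surface_sketch`, the Gaussian kernel, the `bandwidth` and `n_iter` parameters, and the nearest-fitted-point projection are all assumptions made here for illustration.

```python
import numpy as np

def principal_surface_sketch(X, n_iter=5, bandwidth=0.5):
    """Crude principal-surface-style iteration (illustrative assumptions only)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean
    # Linear start: coordinates of each point on the first two principal components.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    lam = Xc @ Vt[:2].T                       # (n, 2) surface parameters
    lam = lam / lam.std(axis=0)               # common scale for both parameters
    for _ in range(n_iter):
        # Local averaging: Gaussian-kernel weights computed in parameter space.
        d2 = ((lam[:, None, :] - lam[None, :, :]) ** 2).sum(axis=2)
        W = np.exp(-0.5 * d2 / bandwidth ** 2)
        W /= W.sum(axis=1, keepdims=True)
        F = W @ Xc                            # fitted surface points, shape (n, p)
        # Projection (discrete approximation): each observation takes the
        # parameter of the fitted surface point closest to it.
        dist2 = ((Xc[:, None, :] - F[None, :, :]) ** 2).sum(axis=2)
        nearest = dist2.argmin(axis=1)
        lam = lam[nearest]
    # Return the parameters and the projected points in the original coordinates.
    return lam, F[nearest] + mean
```

The sketch builds dense n-by-n arrays, so it is only suitable for small data sets; a practical implementation would use a proper smoother and a continuous projection step.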
|
2 |
PCA and CVA biplots : a study of their underlying theory and quality measures
Brand, Hilmarie 03 1900
Thesis (MComm)--Stellenbosch University, 2013. / ENGLISH ABSTRACT: The main topics of study in this thesis are the Principal Component Analysis (PCA)
and Canonical Variate Analysis (CVA) biplots, with the primary focus falling on the
quality measures associated with these biplots. A detailed study of different routes
along which PCA and CVA can be derived precedes the study of the PCA biplot
and CVA biplot respectively. Different perspectives on PCA and CVA highlight
different aspects of the theory that underlie PCA and CVA biplots respectively and
so contribute to a more solid understanding of these biplots and their interpretation.
PCA is studied via the routes followed by Pearson (1901) and Hotelling (1933).
CVA is studied from the perspectives of Linear Discriminant Analysis, Canonical
Correlation Analysis as well as a two-step approach introduced in Gower et al.
(2011). The close relationship between CVA and Multivariate Analysis of Variance
(MANOVA) also receives some attention.
An explanation of the construction of the PCA biplot is provided subsequent to
the study of PCA. Thereafter follows an in-depth investigation of quality measures of
the PCA biplot as well as the relationships between these quality measures. Specific
attention is given to the effect of standardisation on the PCA biplot and its quality
measures.
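As a small illustration of one widely used quality measure, the overall quality of an r-dimensional PCA biplot can be taken as the proportion of total variation accounted for by the first r principal components. The sketch below is an assumption-laden example, not the thesis's code: the function name `pca_biplot_quality` and its parameters are hypothetical, and the thesis studies further measures that are not shown here.

```python
import numpy as np

def pca_biplot_quality(X, r=2, standardise=False):
    """Proportion of total variation displayed in an r-dimensional PCA biplot."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                    # centre each variable
    if standardise:
        Xc = Xc / Xc.std(axis=0, ddof=1)       # unit variance per variable
    s = np.linalg.svd(Xc, compute_uv=False)    # singular values, descending
    eig = s ** 2                               # proportional to PC variances
    return eig[:r].sum() / eig.sum()
```

Calling the function with and without `standardise=True` on data whose variables have very different scales shows how strongly standardisation can change the reported quality.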
Following the study of CVA is an explanation of the construction of the weighted
CVA biplot as well as two different unweighted CVA biplots based on the two-step
approach to CVA. Specific attention is given to the effect of accounting for group sizes
in the construction of the CVA biplot on the representation of the group structure
underlying a data set. It was found that larger groups tend to be better separated
from other groups in the weighted CVA biplot than in the corresponding unweighted
CVA biplots. Similarly it was found that smaller groups tend to be separated to
a greater extent from other groups in the unweighted CVA biplots than in the
corresponding weighted CVA biplot.
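The role of group sizes can be illustrated with a minimal two-step sketch: first transform the data so that the pooled within-group variation becomes spherical, then perform a PCA of the transformed group means, weighting each mean either by its group size or equally. The function name `cva_two_step`, its parameters, and the equal-weights choice for the unweighted case are assumptions for illustration; the thesis's two unweighted variants may be defined differently.

```python
import numpy as np

def cva_two_step(X, groups, weighted=True, r=2):
    """Two-step CVA sketch: canonical scores for samples and group means."""
    X = np.asarray(X, dtype=float)
    labels, idx = np.unique(np.asarray(groups), return_inverse=True)
    n_k = np.bincount(idx).astype(float)          # group sizes
    Xc = X - X.mean(axis=0)
    means = np.vstack([Xc[idx == k].mean(axis=0) for k in range(len(labels))])
    resid = Xc - means[idx]
    W = resid.T @ resid                           # pooled within-group SSP matrix
    # Step 1: symmetric inverse square root L, so that L' W L = I.
    evals, evecs = np.linalg.eigh(W)
    L = evecs @ np.diag(evals ** -0.5) @ evecs.T
    M = means @ L                                 # transformed group means
    # Step 2: PCA of the transformed means, weighted by group size or equally.
    w = n_k if weighted else np.ones_like(n_k)
    Mc = M - np.average(M, axis=0, weights=w)
    B = Mc.T @ (w[:, None] * Mc)
    _, V = np.linalg.eigh(B)                      # eigenvalues in ascending order
    V = V[:, ::-1][:, :r]                         # keep the r leading directions
    return Xc @ L @ V, M @ V
```

Running the function with `weighted=True` and `weighted=False` on data with very unequal group sizes gives two displays whose group separation can be compared directly.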
A detailed investigation of previously defined quality measures of the CVA biplot
follows the study of the CVA biplot. It was found that the accuracy with which the
group centroids of larger groups are approximated in the weighted CVA biplot is
usually higher than that in the corresponding unweighted CVA biplots. Three new
quality measures that assess the accuracy of the Pythagorean distances in the CVA
biplot are also defined. These quality measures assess the accuracy of the Pythagorean
distances between the group centroids, the Pythagorean distances between the
individual samples and the Pythagorean distances between the individual samples
and group centroids in the CVA biplot respectively. / AFRIKAANS SUMMARY (in translation): The main topics of study in this thesis are the Principal Component Analysis (PCA)
biplot and the Canonical Variate Analysis (CVA) biplot, with the primary
focus on the quality measures associated with them. A detailed
study of the different routes along which PCA and CVA can be derived
precedes the study of the PCA and CVA biplots respectively. Different
perspectives on PCA and CVA highlight different aspects of the theory
underlying the PCA and CVA biplots respectively and so contribute
to a more comprehensive understanding of these biplots and their interpretations. PCA
is studied via the routes followed by Pearson (1901) and Hotelling
(1933). CVA is studied from the perspectives of Linear Discriminant Analysis,
Canonical Correlation Analysis as well as a two-step approach as proposed in
Gower et al. (2011). The close relationship between CVA and Multivariate Analysis
of Variance (MANOVA) also receives attention.
An explanation of the construction of the PCA biplot is provided after
the study of PCA. Thereafter follows an in-depth investigation of the PCA
biplot quality measures as well as the relationships among these
quality measures. Specific attention is given to the effect of standardisation
on the PCA biplot and its quality measures.
Following the study of CVA is an explanation of the construction of
the weighted CVA biplot as well as two different unweighted CVA biplots
based on the two-step approach to CVA. Specific attention is given to
the effect that accounting for group sizes in the construction of the CVA
biplot has on the representation of the group structure underlying a data
set. It was found that larger groups are better separated from other groups in the weighted
CVA biplot than in the corresponding unweighted CVA biplot. Similarly,
it was found that smaller groups are separated to a greater extent from other groups in
the unweighted CVA biplot than in the corresponding weighted CVA biplot.
A detailed investigation of previously defined quality measures of
the CVA biplot follows the study of the CVA biplot. It was found
that the accuracy with which the group means of larger groups are approximated
in the weighted CVA biplot is usually higher than in the corresponding
unweighted CVA biplots. Three new quality measures that measure the accuracy
of the Pythagorean distances in the CVA biplot are defined. These
quality measures describe, respectively, the accuracy of the representation
of the Pythagorean distances between the group means, the Pythagorean distances
between the individual observations, and the Pythagorean distances between the individual
observations and group means in the CVA biplot.
|