Return to search

Multiscale principal component analysis

The problem of approximating multidimensional data with objects of lower dimension is a classical problem in complexity reduction. It is important that data approximation capture the structure(s) and dynamics of the data, however distortion to data by many methods during approximation implies that some geometric structure(s) of the data may not be preserved during data approximation. For methods that model the manifold of the data, the quality of approximation depends crucially on the initialization of the method. The first part of this thesis investigates the effect of initialization on manifold modelling methods. Using Self Organising Maps (SOM) as a case study, we compared the quality of learning of manifold methods for two popular initialization methods; random initialization and principal component initialization. To further understand the dynamics of manifold learning, datasets were further classified into linear, quasilinear and nonlinear. The second part of this thesis focuses on revealing geometric structure(s) in high dimension data using an extension of Principal Component Analysis (PCA). Feature extraction using (PCA) favours direction with large variance which could obfuscate other interesting geometric structure(s) that could be present in the data. To reveal these intrinsic structures, we analysed the local PCA structures of the dataset. An equivalent definition of PCA is that it seeks subspaces that maximize the sum of pairwise distances of data projection; extending this definition we define localization in term of scale as maximizing the sum of weighted squared pairwise distances between data projections for various distributions of weights (scales). Since for complex data various regions of the dataspace could have different PCA structures, we also define localization with regards to dataspace. The resulting local PCA structures were represented by the projection matrix corresponding to the subspaces and analysed to reveal some structures in the data at various localizations.

Identiferoai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:679861
Date January 2016
CreatorsAkinduko, Ayodeji Akinwumi
ContributorsGorban, Alexander
PublisherUniversity of Leicester
Source SetsEthos UK
Detected LanguageEnglish
TypeElectronic Thesis or Dissertation
Sourcehttp://hdl.handle.net/2381/36616

Page generated in 0.002 seconds