
Estimating the Intrinsic Dimension of High-Dimensional Data Sets: A Multiscale, Geometric Approach

<p>This work addresses the problem of estimating the intrinsic dimension of noisy, high-dimensional point clouds. It considers a general class of sets that are locally well approximated by <italic>k</italic>-dimensional planes but are embedded in a <italic>D</italic>-dimensional Euclidean space with <italic>D</italic> ≫ <italic>k</italic>. Given samples from such a set, possibly corrupted by high-dimensional noise, the dimension can be recovered using PCA if the data is linear. When the data is non-linear, however, PCA fails, overestimating the intrinsic dimension. A multiscale version of PCA is therefore introduced that is robust to small sample size, noise, and non-linearities in the data.</p> / Dissertation
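The contrast the abstract draws between global and local PCA can be illustrated with a small sketch. This is not the dissertation's multiscale algorithm: it uses a single hand-picked scale. The test set (a noisy circle in 20 dimensions), the 95% variance threshold, the radius, and the helper names `pca_dimension` and `local_pca_dimension` are all assumptions made for illustration.

```python
import numpy as np

def pca_dimension(points, energy=0.95):
    """Number of principal components needed to capture `energy` of the variance.

    The 95% threshold is an illustrative assumption, not a rule from the dissertation.
    """
    centered = points - points.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    cumulative = np.cumsum(variances) / variances.sum()
    # First index at which the cumulative variance fraction reaches `energy`.
    return int(np.searchsorted(cumulative, energy) + 1)

def local_pca_dimension(points, center, radius, energy=0.95):
    """Apply the same PCA criterion to the points within `radius` of `center`."""
    distances = np.linalg.norm(points - center, axis=1)
    return pca_dimension(points[distances <= radius], energy)

# Hypothetical data: a circle (intrinsic dimension k = 1) embedded in D = 20
# dimensions and corrupted by small Gaussian noise in every coordinate.
rng = np.random.default_rng(0)
n, D = 2000, 20
theta = rng.uniform(0.0, 2.0 * np.pi, n)
circle = np.zeros((n, D))
circle[:, 0] = np.cos(theta)
circle[:, 1] = np.sin(theta)
noisy = circle + 0.001 * rng.standard_normal((n, D))

global_dim = pca_dimension(noisy)                             # PCA on the whole set
local_dim = local_pca_dimension(noisy, noisy[0], radius=0.3)  # PCA in one small ball

print("global PCA estimate:", global_dim)
print("local PCA estimate:", local_dim)
```

On this curved set, global PCA needs two components because the circle spans a plane, while PCA restricted to a small ball sees a nearly straight arc. The radius here was chosen by hand; a ball that is too small is dominated by noise and one that is too large by curvature, which is why examining a range of scales rather than one fixed radius is attractive.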

Identifier oai:union.ndltd.org:DUKE/oai:dukespace.lib.duke.edu:10161/3863
Date January 2011
Creators Little, Anna Victoria
Contributors Maggioni, Mauro
Source Sets Duke University
Detected Language English
Type Dissertation
