Dimensionality reduction techniques such as t-SNE and UMAP are useful both for obtaining an overview of high-dimensional datasets and as a stage in a machine learning pipeline. These techniques create a non-parametric model of the manifold by fitting a density kernel about each data point using the distances to its k-nearest neighbors. In dense regions this approach works well, but in sparse regions it tends to draw unrelated points into the nearest cluster. Our work focuses on a homotopy method that imposes graph-based regularization over the manifold parameters to update the embedding. As the homotopy parameter increases, so does the cost of modeling different scales between adjacent neighborhoods. This gradually imposes a more uniform scale over the manifold, resulting in a more faithful embedding that preserves structure in dense areas while pushing sparse, anomalous points outward.
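The idea of penalizing scale differences between adjacent neighborhoods can be illustrated with a small sketch. It is not the thesis's algorithm. It assumes a per-point scale equal to the mean distance to the k nearest neighbors, a quadratic penalty on differences in log-scale across k-nearest-neighbor graph edges, and a plain Jacobi iteration as the solver. The names `knn_graph`, `smooth_scales`, and `lam` (standing in for the homotopy parameter) are hypothetical.

```python
import math

def knn_graph(points, k):
    """Return symmetric kNN adjacency sets and an assumed per-point log-scale."""
    n = len(points)
    neighbors = [set() for _ in range(n)]
    log_scale = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        nearest = dists[:k]
        # Assumption: the kernel scale is the mean distance to the k nearest neighbors.
        log_scale.append(math.log(sum(d for d, _ in nearest) / k))
        for _, j in nearest:
            neighbors[i].add(j)
            neighbors[j].add(i)  # symmetrize the graph
    return neighbors, log_scale

def smooth_scales(neighbors, log_scale, lam, iters=500):
    """Approximately minimize
        sum_i (s_i - s0_i)^2 + lam * sum_{edges (i,j)} (s_i - s_j)^2
    by Jacobi iteration on the stationarity condition
        s_i * (1 + lam * deg_i) = s0_i + lam * sum_{j in N(i)} s_j.
    """
    s = list(log_scale)
    for _ in range(iters):
        s = [
            (log_scale[i] + lam * sum(s[j] for j in neighbors[i]))
            / (1.0 + lam * len(neighbors[i]))
            for i in range(len(s))
        ]
    return s

# A dense 3x3 grid plus one distant point that plays the role of a sparse anomaly.
points = [(0.1 * x, 0.1 * y) for x in range(3) for y in range(3)] + [(5.0, 5.0)]
neighbors, s0 = knn_graph(points, k=3)
for lam in (0.0, 1.0, 10.0):
    s = smooth_scales(neighbors, s0, lam)
    print(f"lam={lam}: spread of log-scales = {max(s) - min(s):.3f}")
```

With `lam = 0` the scales are left as estimated from the neighbor distances. Larger values pull the scales of adjacent neighborhoods toward each other, so the isolated point no longer receives a kernel wide enough to reach into the cluster.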
Identifier | oai:union.ndltd.org:wpi.edu/oai:digitalcommons.wpi.edu:etd-theses-2399 |
Date | 07 May 2020 |
Creators | Beach, David J. |
Contributors | Randy C. Paffenroth, Advisor |
Publisher | Digital WPI |
Source Sets | Worcester Polytechnic Institute |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | Masters Theses (All Theses, All Years) |