
Bi-filtration and stability of TDA mapper for point cloud data

TDA mapper is an algorithm used to visualize and analyze big data. It is applied to a dataset X equipped with a filter function f from X to R. The output of the algorithm is an abstract graph (or, more generally, a simplicial complex) that captures topological and geometric information about the space underlying X.
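As a rough illustration of the pipeline just described, the following Python sketch builds a mapper graph for a one-dimensional filter. The function names (`single_linkage`, `mapper_graph`), the parameters `n_bins`, `overlap`, and `eps`, and the use of single-linkage clustering at a fixed threshold are assumptions made for this example. They are not taken from the thesis.

```python
from itertools import combinations
from math import dist


def single_linkage(points, idxs, eps):
    """Group the indices in idxs into connected components of the graph
    that joins two points whenever their distance is at most eps."""
    parent = {i: i for i in idxs}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(idxs, 2):
        if dist(points[i], points[j]) <= eps:
            parent[find(i)] = find(j)

    clusters = {}
    for i in idxs:
        clusters.setdefault(find(i), set()).add(i)
    return [frozenset(c) for c in clusters.values()]


def mapper_graph(points, f, n_bins, overlap, eps):
    """Return (nodes, edges) of a mapper graph for a real-valued filter f.

    nodes: list of clusters, each a frozenset of point indices.
    edges: set of index pairs (u, v) whose clusters share a point.
    """
    values = [f(p) for p in points]
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    pad = overlap * width / 2  # each bin is widened by pad on both sides

    nodes = []
    for b in range(n_bins):
        left = lo + b * width - pad
        right = lo + (b + 1) * width + pad
        # Preimage of the widened bin under f, then cluster it.
        idxs = [i for i, v in enumerate(values) if left <= v <= right]
        nodes.extend(single_linkage(points, idxs, eps))

    # Nerve of the cover, restricted to its 1-skeleton.
    edges = {
        (u, v)
        for u, v in combinations(range(len(nodes)), 2)
        if nodes[u] & nodes[v]
    }
    return nodes, edges
```

For five collinear points `(0, 0)` through `(4, 0)` filtered by their x-coordinate, calling `mapper_graph(pts, lambda p: p[0], n_bins=2, overlap=0.5, eps=1.0)` yields two nodes joined by a single edge, reflecting the fact that the points form one connected segment.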
One question of interest about TDA mapper is whether a mapper graph is stable. That is, if a dataset X is perturbed by a small amount ∂, giving a perturbed dataset X∂, we would like to compare the TDA mapper graph of X to the TDA mapper graph of X∂. Tamal Dey, Facundo Memoli, and Yusu Wang proved that, for a topological space X, TDA mapper is stable provided the cover of the image of f satisfies certain conditions. That is, the mapper graph of X differs from the mapper graph of X∂ by a small amount, as measured via homology.
The goal of this thesis is three-fold. The first goal is to introduce a modified TDA mapper algorithm. The fundamental difference between TDA mapper and the modified version is that the modified version avoids the use of a filter function. A comparison of the mapper graph outputs shows that the proposed modified mapper captures more geometric and topological features. We discuss the advantages and disadvantages of the modified mapper.
Tamal Dey, Facundo Memoli, and Yusu Wang showed that a filtration of covers induces a filtration of simplicial complexes, which in turn induces a filtration of homology groups. While Dey, Memoli, and Wang focused on TDA mapper's application to topological spaces, the second goal of this thesis is to show that DBSCAN clustering gives a filtration of covers when TDA mapper is applied to a point cloud. Hence, DBSCAN gives a filtration of mapper graphs (simplicial complexes) and of homology groups. More importantly, DBSCAN gives these filtrations in three parameter directions: bin size, epsilon, and MinPts. Hence, there is a multi-dimensional filtration of covers, mapper graphs, and homology groups. We also note that single-linkage clustering is a special case of DBSCAN clustering, so the results proved for DBSCAN also hold for single-linkage. However, complete-linkage does not give a filtration of covers in the bin-size direction, so no filtration of simplicial complexes or homology groups exists when complete-linkage is used to cluster a dataset. In general, the results hold for any clustering algorithm that gives a filtration of covers.
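To make the idea of a filtration of covers concrete in one parameter direction, the sketch below checks that clusters are nested as the distance threshold grows. It uses single-linkage clustering at a fixed threshold as a stand-in, since the paragraph above describes it as a special case of DBSCAN. The helper names `components` and `refines`, the sample points, and the threshold values are all hypothetical choices for this example.

```python
from itertools import combinations
from math import dist


def components(points, eps):
    """Connected components of the graph joining points at distance <= eps."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= eps:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), set()).add(i)
    return [frozenset(c) for c in clusters.values()]


def refines(fine, coarse):
    """True if every cluster in `fine` lies inside some cluster in `coarse`."""
    return all(any(c <= d for d in coarse) for c in fine)


# Illustrative point cloud and increasing thresholds (assumed values).
points = [(0.0, 0.0), (1.0, 0.0), (2.5, 0.0), (6.0, 0.0)]
covers = [components(points, eps) for eps in (0.5, 1.0, 1.5, 4.0)]

# Each cover should refine the next one as eps increases.
nested = all(refines(covers[k], covers[k + 1]) for k in range(len(covers) - 1))
```

Here the number of clusters drops from four to one as the threshold grows, and `nested` is `True`: each cluster at a smaller threshold sits inside a cluster at every larger threshold, which is the nesting a filtration requires.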
The third and final goal of this thesis is to prove that two multi-dimensional persistence modules, one associated with the original dataset X and the other with the ∂-perturbation of X, are 2∂-interleaved. In other words, the mapper graphs of X and X∂ differ by a small amount, as measured by homology.

Identifier: oai:union.ndltd.org:uiowa.edu/oai:ir.uiowa.edu:etd-8330
Date: 01 August 2019
Creators: Bungula, Wako Tasisa
Contributors: Darcy, Isabel K.
Publisher: University of Iowa
Source Sets: University of Iowa
Language: English
Detected Language: English
Type: dissertation
Format: application/pdf
Source: Theses and Dissertations
Rights: Copyright © 2019 Wako Tasisa Bungula
