Global ETD Search

Optimizing t-SNE using random sampling techniques

This thesis concerns t-SNE, a dimensionality reduction technique that has become popular for its ability to preserve well-separated clusters from a high-dimensional space. Our goal is twofold. First, we give an introduction to the use of dimensionality reduction techniques in visualization and, following recent research, show that t-SNE in particular is successful at preserving well-separated clusters. Second, we perform a thorough series of experiments that allow us to draw conclusions about the quality of embeddings obtained by running t-SNE on samples of data drawn with different sampling techniques. We compare pure random sampling, random walk sampling, and so-called hubness sampling on a dataset, attempting to find a sampling method that is consistently better at preserving local information than simple random sampling. Throughout our testing, a specific variant of random walk sampling distinguished itself as a better alternative to pure random sampling.
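
To make the comparison in the abstract concrete, the following is a minimal sketch of two of the sampling schemes it names: pure random sampling and a generic random walk over a k-nearest-neighbour graph. The abstract does not say which random walk variant performed best, so everything here is an assumption made for illustration: the function names, the brute-force neighbour search, the toy two-cluster data, and the `restart_after` safeguard are all hypothetical. Hubness sampling is not shown.

```python
import math
import random


def knn_graph(points, k):
    """Brute-force k-nearest-neighbour lists using Euclidean distance."""
    graph = []
    for i, p in enumerate(points):
        others = (j for j in range(len(points)) if j != i)
        nearest = sorted(others, key=lambda j: math.dist(p, points[j]))
        graph.append(nearest[:k])
    return graph


def pure_random_sample(n_points, m, rng):
    """Baseline: m distinct indices drawn uniformly at random."""
    return rng.sample(range(n_points), m)


def random_walk_sample(graph, m, rng, restart_after=50):
    """Collect m distinct indices by walking the neighbour graph.

    Assumption (not from the thesis): if the walk finds no new node for
    `restart_after` consecutive steps, it jumps to a uniformly random
    node so that it cannot stay trapped in one connected component.
    """
    if m > len(graph):
        raise ValueError("sample size exceeds number of points")
    current = rng.randrange(len(graph))
    sample = [current]
    seen = {current}
    stale = 0
    while len(sample) < m:
        if stale >= restart_after:
            current = rng.randrange(len(graph))
            stale = 0
        else:
            current = rng.choice(graph[current])
        if current in seen:
            stale += 1
        else:
            seen.add(current)
            sample.append(current)
            stale = 0
    return sample


# Toy data: two well-separated Gaussian clusters in the plane.
rng = random.Random(0)
points = [
    (rng.gauss(cx, 0.3), rng.gauss(cy, 0.3))
    for cx, cy in [(0.0, 0.0), (5.0, 5.0)]
    for _ in range(100)
]
graph = knn_graph(points, k=5)
baseline = pure_random_sample(len(points), 40, rng)
walk = random_walk_sample(graph, 40, rng)
```

Either index list would then select the subset of points handed to a t-SNE implementation, and the resulting embeddings could be compared on how well they preserve local neighbourhoods.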

http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-88585

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:lnu-88585
Date	January 2019
Creators	Buljan, Matej
Publisher	Linnéuniversitetet, Institutionen för matematik (MA)
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
