
Finding Succinct Representations For Clusters

Improving the explainability of results from machine learning methods has become an important research goal. In this thesis, we have studied the problem of making clusters more interpretable using a recent approach by Davidson et al. and Sambaturu et al. based on succinct representations of clusters. Given a set of objects S, a partition of S (into clusters), and a universe T of descriptors such that each element in S is associated with a subset of descriptors, the goal is to find a representative set of descriptors for each cluster such that those sets are pairwise disjoint and the total size of all the representatives is at most a given budget. Since this problem is NP-hard in general, Sambaturu et al. have developed a suite of approximation algorithms for it. We also show applications to explaining clusters of genomic sequences that represent different threat levels.

Master of Science

Improving the explainability of results from machine learning methods has become an important research goal. Clustering is a commonly used machine learning technique that is applied to a variety of datasets. In this thesis, we have studied the problem of making clusters more interpretable, and have tried to answer whether it is possible to explain clusters using a set of attributes that were not used to generate those clusters.
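The problem statement in the technical abstract can be made concrete with a small feasibility check. The sketch below is illustrative only and is not taken from the thesis: the function names and the toy data are hypothetical, and the coverage measure is an assumption, since the abstract does not state how a representative set must relate to the objects in its cluster. Here an object is assumed to count as covered when it carries at least one descriptor chosen for its own cluster.

```python
from itertools import combinations


def is_valid_description(chosen, budget):
    """Check the two constraints named in the abstract.

    chosen maps each cluster id to the set of descriptors picked for it.
    The sets must be pairwise disjoint and their total size must not
    exceed the budget.
    """
    for a, b in combinations(chosen, 2):
        if chosen[a] & chosen[b]:  # a descriptor is shared by two clusters
            return False
    return sum(len(d) for d in chosen.values()) <= budget


def coverage(clusters, descriptors_of, chosen):
    """Fraction of each cluster's objects that carry a chosen descriptor.

    Assumption (not stated in the abstract): an object is covered when it
    has at least one descriptor from its own cluster's representative set.
    """
    result = {}
    for cid, members in clusters.items():
        picked = chosen.get(cid, set())
        covered = sum(1 for obj in members if descriptors_of[obj] & picked)
        result[cid] = covered / len(members) if members else 1.0
    return result


# Toy instance: S = {s1, s2, s3}, T = {t1, t2, t3}, two clusters.
clusters = {"A": {"s1", "s2"}, "B": {"s3"}}
descriptors_of = {"s1": {"t1", "t2"}, "s2": {"t2"}, "s3": {"t1", "t3"}}
chosen = {"A": {"t2"}, "B": {"t3"}}

print(is_valid_description(chosen, budget=2))
print(coverage(clusters, descriptors_of, chosen))
```

A checker like this only verifies a candidate solution. Finding a good one is the NP-hard part that the approximation algorithms mentioned in the abstract address.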

Identifier: oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/91388
Date: 09 July 2019
Creators: Gupta, Aparna
Contributors: Computer Science, Marathe, Madhav Vishnu, Vullikanti, Anil Kumar S., Swarup, Samarth
Publisher: Virginia Tech
Source Sets: Virginia Tech Theses and Dissertation
Detected Language: English
Type: Thesis
Format: ETD, application/pdf
Rights: In Copyright, http://rightsstatements.org/vocab/InC/1.0/
