We present INCREMENT, a cluster refinement algorithm that uses user feedback to refine clusterings. INCREMENT can improve clusterings produced by arbitrary clustering algorithms. The initial clustering is first sub-clustered to improve query efficiency. A small set of selected instances from each sub-cluster is then presented to a user for labelling. Using this feedback, INCREMENT trains a feature embedder to map the input features to a new feature space. This space is learned so that spatial distance is inversely correlated with semantic similarity, as determined from the user feedback. A final clustering is then formed in the embedded space. INCREMENT is tested on 9 datasets initially clustered with 4 distinct clustering algorithms. INCREMENT improved the accuracy of 71% of the initial clusterings with respect to a target clustering. Across all experiments, the median percent improvement is 27.3% for V-Measure and 6.08% for accuracy.
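The abstract reports results as percent improvement in V-Measure. The following is a minimal sketch of how such a figure could be computed. It assumes the standard V-Measure definition (the harmonic mean of homogeneity and completeness) and assumes that "percent improvement" means relative change from the initial score. The record does not state the thesis's exact evaluation procedure, and the function names and toy labels here are hypothetical.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """Conditional entropy H(target | given) in nats."""
    n = len(target)
    joint = Counter(zip(given, target))   # counts of (given, target) pairs
    given_counts = Counter(given)
    return -sum((c / n) * log(c / given_counts[g])
                for (g, _), c in joint.items())


def v_measure(truth, pred):
    """Standard V-Measure: harmonic mean of homogeneity and completeness."""
    h_truth, h_pred = entropy(truth), entropy(pred)
    homogeneity = 1.0 if h_truth == 0 else 1.0 - conditional_entropy(truth, pred) / h_truth
    completeness = 1.0 if h_pred == 0 else 1.0 - conditional_entropy(pred, truth) / h_pred
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


def percent_improvement(before, after):
    """Relative change in percent (an assumed reading of 'percent improvement')."""
    return 100.0 * (after - before) / before


# Hypothetical toy labels: a target clustering, an over-split initial
# clustering, and a refined clustering that matches the target up to renaming.
target = [0, 0, 0, 1, 1, 1]
initial = [0, 0, 1, 1, 2, 2]
refined = [1, 1, 1, 0, 0, 0]

v_before = v_measure(target, initial)
v_after = v_measure(target, refined)
print(f"V-Measure before: {v_before:.3f}, after: {v_after:.3f}")
print(f"Percent improvement: {percent_improvement(v_before, v_after):.1f}%")
```

V-Measure depends only on how instances are grouped, not on the cluster label values, so the refined clustering above scores as a perfect match even though its labels are swapped relative to the target.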
Identifier | oai:union.ndltd.org:BGMYU2/oai:scholarsarchive.byu.edu:etd-6794 |
Date | 01 March 2016 |
Creators | Mitchell, Logan Adam |
Publisher | BYU ScholarsArchive |
Source Sets | Brigham Young University |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | All Theses and Dissertations |
Rights | http://lib.byu.edu/about/copyright/ |