This paper explores topic modeling using the text of Alice in Wonderland as an example. It examines both singular value decomposition (SVD) and non-negative matrix factorization (NMF) as methods for feature extraction. The paper then explores methods for partially supervised topic modeling through the introduction of themes. A large portion of the paper focuses on implementing these techniques in Python and on visualizing the results using a combination of Python, HTML, and JavaScript along with the D3 framework. The paper concludes by presenting a mixture of SVD, NMF, and partially supervised NMF as a possible way to improve topic modeling.
Identifier | oai:union.ndltd.org:CLAREMONT/oai:scholarship.claremont.edu:cmc_theses-2795 |
Date | 01 January 2018 |
Creators | Smith, Sydney |
Publisher | Scholarship @ Claremont |
Source Sets | Claremont Colleges |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | CMC Senior Theses |
Rights | © 2017 Sydney Smith, default |