Inference Methods for Token-Level Topic Assignments with Fixed Topics
Cowley, Stephen, 23 December 2023
Topic modeling, an unsupervised technique used to gain a high-level understanding of a large collection of documents, often involves two major goals: the discovery of the topics used in the corpus (topic discovery) and the assignment of topics to individual words (token-level topic assignment). While Latent Dirichlet Allocation (LDA) normally performs both of these steps simultaneously, some situations require only the token-level topic assignments, using fixed topics. We evaluate three topic assignment strategies using fixed topics (Gibbs sampling, iterated conditional modes, and mean field variational inference) to determine which should be used when only token-level topic assignment is needed. Among these methods, we find that iterated conditional modes performs best with respect to significance, consistency, and runtime, and that variational inference yields the best downstream classification accuracy.
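As a rough illustration of what token-level assignment with fixed topics can look like, the following is a minimal sketch of iterated conditional modes. It is not taken from the thesis. It assumes the common LDA-style conditional in which a token's score for topic k is the fixed topic-word probability times the smoothed count of other tokens in the same document assigned to k. The names `icm_assign`, `phi`, and `alpha`, and the toy topic matrix, are hypothetical and chosen only for this example.

```python
def icm_assign(doc, phi, alpha=0.1, max_sweeps=50):
    """Assign one topic per token by iterated conditional modes.

    doc   -- list of word ids for a single document
    phi   -- fixed topics: phi[k][w] is the probability of word w under topic k
    alpha -- assumed symmetric Dirichlet smoothing on document-topic counts
    """
    num_topics = len(phi)
    # Start each token at the topic most likely to emit its word.
    z = [max(range(num_topics), key=lambda k: phi[k][w]) for w in doc]
    counts = [0] * num_topics
    for k in z:
        counts[k] += 1

    for _ in range(max_sweeps):
        changed = False
        for i, w in enumerate(doc):
            counts[z[i]] -= 1  # exclude the current token from the counts
            # Take the mode of the conditional instead of sampling from it.
            best = max(range(num_topics),
                       key=lambda k: phi[k][w] * (counts[k] + alpha))
            counts[best] += 1
            if best != z[i]:
                z[i] = best
                changed = True
        if not changed:  # a full sweep with no change is a local optimum
            break
    return z


# Toy fixed topics over a 3-word vocabulary; word 2 is equally likely in both.
PHI = [[0.6, 0.1, 0.3],
       [0.1, 0.6, 0.3]]
print(icm_assign([0, 0, 0, 2], PHI))
print(icm_assign([1, 1, 1, 2], PHI))
```

In this sketch the ambiguous word 2 is pulled toward whichever topic dominates the rest of its document. Replacing the `max` step with a draw from the normalized scores would turn the same loop into a Gibbs sampler, which is one way to see how closely related the two strategies are.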