Pachinko allocation: DAG-structured mixture models of topic correlations

Statistical topic models are increasingly popular tools for summarization and manifold discovery in discrete data. However, most existing approaches capture few or no correlations between topics. We propose the pachinko allocation model (PAM), which captures arbitrary, nested, and possibly sparse correlations between topics using a directed acyclic graph (DAG). We present various structures within this framework, different parameterizations of topic distributions, and an extension to capture dynamic patterns of topic correlations. We also introduce a non-parametric Bayesian prior to automatically learn the topic structure from data. The model is evaluated on document classification, likelihood of held-out data, the ability to support fine-grained topics, and topical keyword coherence. With a highly scalable approximation, PAM has also been applied to discover topic hierarchies in very large datasets.
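
To make the DAG idea concrete, the following is a minimal sketch of how a document could be generated under one simple PAM structure: a four-level DAG with a root, a layer of super-topics, a layer of sub-topics, and the vocabulary at the leaves. The abstract does not spell out this structure or any parameter values, so the layer sizes, the symmetric Dirichlet parameter `alpha`, and all function names here are illustrative assumptions, not the dissertation's implementation.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from a Dirichlet by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_index(probs, rng):
    """Draw an index from a discrete distribution given as a list of weights."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_document(n_words, n_super, n_sub, topic_word, rng, alpha=1.0):
    """Generate (super_topic, sub_topic, word_id) triples for one document.

    Assumed four-level DAG: root -> super-topics -> sub-topics -> words.
    topic_word[t] is the (hypothetical, fixed) word distribution of sub-topic t.
    """
    # Per-document distribution of the root over super-topics.
    theta_root = sample_dirichlet([alpha] * n_super, rng)
    # Per-document distribution of each super-topic over sub-topics.
    # These are what let the model express correlations between sub-topics.
    theta_super = [sample_dirichlet([alpha] * n_sub, rng) for _ in range(n_super)]

    doc = []
    for _ in range(n_words):
        s = sample_index(theta_root, rng)      # walk root -> super-topic
        t = sample_index(theta_super[s], rng)  # walk super-topic -> sub-topic
        w = sample_index(topic_word[t], rng)   # emit a word from the sub-topic
        doc.append((s, t, w))
    return doc
```

Each word is produced by a path from the root down to a leaf, which is the sense in which the DAG's interior nodes act as distributions over other topics rather than over words. Calling `generate_document(20, 2, 3, topic_word, random.Random(0))` with a three-row `topic_word` table returns twenty such paths.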

https://scholarworks.umass.edu/dissertations/AAI3289214

Identifier	oai:union.ndltd.org:UMASS/oai:scholarworks.umass.edu:dissertations-4826
Date	01 January 2007
Creators	Li, Wei
Publisher	ScholarWorks@UMass Amherst
Source Sets	University of Massachusetts, Amherst
Language	English
Detected Language	English
Type	text
Source	Doctoral Dissertations Available from Proquest
