Return to search

The statistics of topic modelling.

This research project aims to provide a clear and concise guide to latent dirichlet allocation which is a form of topic modelling. The aim is to help researchers who do not have a strong background in mathematics or statistics to feel comfortable with using topic modelling in their work. In order to achieve this, the thesis provides a step-by-step explanation of how topic modelling works. A range of tools that can be used to perform a topic model analysis are also described. The first chapter gives an
explanation of how topic modelling, and (more specifically), latent dirichlet allocation works; it offers a very basic explanation and then provides an easy to follow mathematical explanation. The second
chapter explains how to perform a topic model analysis; this is done through an explanation of each step used to run a topic model analysis, starting from the type of dataset through to the software packages available to use. The third section provides an example topic model analysis, based on the Philpapers dataset. The final section provides a discussion on the highlights of each chapter and areas for further research.

Identiferoai:union.ndltd.org:canterbury.ac.nz/oai:ir.canterbury.ac.nz:10092/11264
Date January 2015
CreatorsAbey, Rebecca
PublisherUniversity of Canterbury. Mathematics and Statistics
Source SetsUniversity of Canterbury
LanguageEnglish
Detected LanguageEnglish
TypeElectronic thesis or dissertation, Text
RightsCopyright Rebecca Abey, http://library.canterbury.ac.nz/thesis/etheses_copyright.shtml
RelationNZCU

Page generated in 0.0024 seconds