Unsupervised topic modeling for customer support chat : Comparing LDA and K-means

Fortnox receives many errands through its support chat. Some of the questions are hard to interpret, which makes it difficult to know where to delegate them. It would be beneficial if this process were automated, so that no time has to be spent analyzing the questions in order to delegate them. The main task is therefore to find an unsupervised model that can take questions and group them into topics. A literature review of NLP and clustering was carried out to find the most suitable models and techniques for the problem. The models and techniques were then implemented and evaluated using support chat questions received by Fortnox. The unsupervised models tested in this thesis were LDA and K-means. The trained models are analyzed, and some of the clusters are given a label. The authors of the thesis label clusters by looking at the most relevant words for each cluster. Three different sets of labels are analyzed and tested. The models are evaluated using five different score metrics: Silhouette, Adjusted Rand Index, Recall, Precision, and F1 score. K-means scores best on these metrics, with an F1 score of 0.417, but it cannot handle very small documents. LDA does not perform well: it achieves an F1 score of 0.137 and is not able to categorize documents together.
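As a rough illustration of how cluster assignments might be scored against hand-assigned labels, the sketch below computes the Adjusted Rand Index and per-label precision, recall, and F1 using only the Python standard library. It is not the thesis authors' code. The questions' labels, the cluster ids, and the `cluster_names` mapping are hypothetical toy values, and scoring each label as its own positive class is an assumption. Silhouette is left out because it needs the document vectors, not just the assignments.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same documents (needs >= 2 documents)."""
    n = len(labels_a)
    cells = Counter(zip(labels_a, labels_b))  # contingency table counts
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster on both sides
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def precision_recall_f1(true_labels, predicted_labels, label):
    """Precision, recall, and F1 for one label, treated as the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical toy data, NOT taken from the thesis.
true_labels = ["invoice", "invoice", "invoice", "login", "login", "login"]
cluster_ids = [0, 0, 1, 1, 1, 1]             # output of some clustering model
cluster_names = {0: "invoice", 1: "login"}   # labels assigned by inspecting top words

predicted_labels = [cluster_names[c] for c in cluster_ids]
ari = adjusted_rand_index(true_labels, cluster_ids)
invoice_scores = precision_recall_f1(true_labels, predicted_labels, "invoice")
login_scores = precision_recall_f1(true_labels, predicted_labels, "login")

print("ARI:", round(ari, 3))
print("invoice (P, R, F1):", invoice_scores)
print("login (P, R, F1):", login_scores)
```

The Adjusted Rand Index compares the raw cluster ids with the reference labels and needs no cluster naming, while precision, recall, and F1 depend on the label given to each cluster, which is why different label sets can give different scores for the same model.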

http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-105353

LDA

K-means

Topic modeling

Natural Language Processing

clustering

customer support

unsupervised machine learning

Computer Sciences

Datavetenskap (datalogi)

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:lnu-105353
Date	January 2021
Creators	Andersson, Fredrik; Idemark, Alexander
Publisher	Linnéuniversitetet, Institutionen för datavetenskap och medieteknik (DM)
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
