Fortnox receives many errands through its support chat. Some of the questions are hard to interpret, which makes it difficult to know where to delegate them. It would be beneficial if the process were automated to answer the questions, instead of staff needing to spend time analyzing the questions in order to delegate them. The main task is therefore to find an unsupervised model that can take questions and group them into topics. A literature review of NLP and clustering was carried out to find the most suitable models and techniques for the problem. The models and techniques were then implemented and evaluated using support chat questions received by Fortnox. The unsupervised models tested in this thesis were LDA and K-means. The trained models are analyzed, and some of the clusters are given a label. The authors of the thesis label a cluster after analyzing it by looking at its most relevant words. Three different sets of labels are analyzed and tested. The models are evaluated using five different score metrics: Silhouette, Adjusted Rand Index, Recall, Precision, and F1 score. K-means scores best on these metrics, with an F1 score of 0.417, but it cannot handle very small documents. LDA does not perform well: it reaches an F1 score of 0.137 and is not able to group documents together.
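As an illustration of the Precision, Recall, and F1 metrics named in the abstract, the following is a minimal sketch of how they could be computed once each document has a predicted topic label and a reference label. The function names, the toy labels, and the choice of macro averaging are assumptions made for this example; the abstract does not state how the thesis averaged its scores.

```python
def precision_recall_f1(true_labels, predicted_labels, label):
    """Return (precision, recall, F1) for a single topic label."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == label and p == label)  # true positives
    fp = sum(1 for t, p in pairs if t != label and p == label)  # false positives
    fn = sum(1 for t, p in pairs if t == label and p != label)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def macro_f1(true_labels, predicted_labels):
    """Unweighted mean of the per-label F1 scores (an assumed averaging scheme)."""
    labels = sorted(set(true_labels))
    scores = [precision_recall_f1(true_labels, predicted_labels, l)[2] for l in labels]
    return sum(scores) / len(scores)


# Hypothetical toy data: reference topics versus topics assigned via cluster labels.
true_labels = ["invoice", "invoice", "payroll", "payroll", "login", "login"]
predicted_labels = ["invoice", "payroll", "payroll", "payroll", "login", "invoice"]

for label in sorted(set(true_labels)):
    print(label, precision_recall_f1(true_labels, predicted_labels, label))
print("macro F1:", macro_f1(true_labels, predicted_labels))
```

Macro averaging weights every topic equally regardless of how many questions it contains; a micro or support-weighted average would be an equally plausible choice and would generally give different numbers.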
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:lnu-105353 |
Date | January 2021 |
Creators | Andersson, Fredrik, Idemark, Alexander |
Publisher | Linnéuniversitetet, Institutionen för datavetenskap och medieteknik (DM) |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |