Text classification using deep learning is rarely applied to tasks with more than ten target classes. This thesis investigates whether deep learning can be successfully applied to a task with over 1,000 target classes. A pretrained Long Short-Term Memory language model is fine-tuned and used as the base for the classifier. After five days of training, the deep learning model achieves 80.5% accuracy on a publicly available dataset, 9.3% higher than Naive Bayes. With five guesses, the model predicts the correct class 92.2% of the time.
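The "five guesses" figure in the abstract is what is commonly called top-5 accuracy: a prediction counts as correct if the true class is among the five highest-scoring classes. The following is a minimal sketch of that metric, not code from the thesis; the function name `top_k_accuracy` and the toy scores and labels are hypothetical and chosen only for illustration.

```python
def top_k_accuracy(scores, labels, k=5):
    """Fraction of examples whose true label is among the k highest-scoring classes.

    scores: one list of per-class scores per example (hypothetical model outputs).
    labels: the true class index for each example.
    """
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score; keep the first k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy data: 3 examples, 4 classes (made-up numbers, not results from the thesis).
scores = [
    [0.1, 0.6, 0.2, 0.1],
    [0.5, 0.1, 0.3, 0.1],
    [0.2, 0.2, 0.1, 0.5],
]
labels = [1, 2, 0]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
```

With k=1 this reduces to ordinary accuracy, so a top-k score can never be lower than the plain accuracy computed on the same predictions.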
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-385162 |
Date | January 2019 |
Creators | Grünwald, Adam |
Publisher | Uppsala universitet, Statistiska institutionen |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |