Natural language processing has many important applications in today, such as translations, spam filters, and other useful products. To achieve these applications supervised and unsupervised machine learning models, have shown to be successful. The most important aspect of these models is what the model can achieve with different datasets. This article will examine how RNN models compare with Naive Bayes in text classification. The chosen RNN models are long short-term memory (LSTM) and gated recurrent unit (GRU). Both LSTM and GRU will be trained using the flair Framework. The models will be trained on three separate datasets with different compositions, where the trend within each model will be examined and compared with the other models. The result showed that Naive Bayes performed better on classifying short sentences than the RNN models, but worse in longer sentences. When trained on a small dataset LSTM and GRU had a better result then Naive Bayes. The best performing model was Naive Bayes, which had the highest accuracy score in two out of the three datasets.
Identifer | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-469356 |
Date | January 2022 |
Creators | Jensen, Julius |
Publisher | Uppsala universitet, Institutionen för elektroteknik |
Source Sets | DiVA Archive at Upsalla University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Relation | ELEKTRO-T ; 22003 |
Page generated in 0.0199 seconds