All Negative on the Western Front: Analyzing the Sentiment of the Russian News Coverage of Sweden with Generic and Domain-Specific Multinomial Naive Bayes and Support Vector Machines Classifiers / På västfronten intet gott: attitydanalys av den ryska nyhetsrapporteringen om Sverige med generiska och domänspecifika Multinomial Naive Bayes- och Support Vector Machines-klassificerare

This thesis explores to what extent Multinomial Naive Bayes (MNB) and Support Vector Machines (SVM) classifiers can be used to determine the polarity of news, specifically the news coverage of Sweden by the Russian state-funded news outlets RT and Sputnik. Three experiments are conducted. In the first experiment, an MNB and an SVM classifier are trained on the Large Movie Review Dataset (Maas et al., 2011) with a varying number of samples to determine how training data size affects classifier performance. In the second experiment, the classifiers are trained on 300 positive, negative, and neutral news articles (Agarwal et al., 2019) and tested on 95 RT and Sputnik news articles about Sweden (Bengtsson, 2019) to determine whether the domain specificity of the training data outweighs its limited size. In the third experiment, the movie-trained classifiers are compared with the domain-specific classifiers to determine whether well-trained classifiers from another domain perform better than relatively untrained, domain-specific classifiers. Four types of feature sets (unigrams, unigrams without stop word removal, bigrams, and trigrams) are used in the experiments. Some of the model parameters (TF-IDF vs. feature count, and SVM’s C parameter) are optimized with 10-fold cross-validation. Besides showing that SVM outperforms MNB, the results highlight the need for comprehensive and domain-specific training data in machine learning tasks, as well as the benefits of feature engineering and, to a limited extent, of stop word removal. Interestingly, the classifiers performed best on the negative news articles, which made up most of the test set (and possibly of Russian news coverage of Sweden in general).
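
As a rough illustration of the kind of classifier the abstract describes, the following is a minimal sketch of a multinomial Naive Bayes polarity classifier over n-gram counts with optional stop word removal. It is not the thesis's implementation: the class name `TinyMultinomialNB`, the helper `ngrams`, the smoothing value `alpha`, and the three toy training sentences are all assumptions made up for this example. The SVM classifier, TF-IDF weighting, and 10-fold cross-validation mentioned above are left out.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n=1, stop_words=frozenset()):
    """Lowercase, split on whitespace, drop stop words, return n-gram tuples."""
    tokens = [t for t in text.lower().split() if t not in stop_words]
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


class TinyMultinomialNB:
    """Hypothetical, illustrative multinomial Naive Bayes with add-alpha smoothing."""

    def __init__(self, n=1, alpha=1.0, stop_words=frozenset()):
        self.n = n                    # 1 = unigrams, 2 = bigrams, 3 = trigrams
        self.alpha = alpha            # smoothing constant (assumed value)
        self.stop_words = stop_words  # empty set = no stop word removal

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.feature_counts[label].update(
                ngrams(text, self.n, self.stop_words))
        self.vocab = {f for c in self.feature_counts.values() for f in c}
        return self

    def predict(self, text):
        # Features never seen in training are ignored.
        feats = [f for f in ngrams(text, self.n, self.stop_words)
                 if f in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            for f in feats:
                score += math.log((counts[f] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy data invented for this sketch (not taken from any dataset in the thesis).
train_texts = [
    "talks end in success and praise",
    "crisis deepens amid failure and chaos",
    "officials publish quarterly report",
]
train_labels = ["positive", "negative", "neutral"]

unigram_clf = TinyMultinomialNB(n=1).fit(train_texts, train_labels)
bigram_clf = TinyMultinomialNB(n=2).fit(train_texts, train_labels)
print(unigram_clf.predict("chaos and failure"))
print(bigram_clf.predict("failure and chaos"))
```

Changing `n` switches between unigram, bigram, and trigram feature sets, and passing a non-empty `stop_words` set approximates stop word removal, so the same sketch covers the four feature-set variants named in the abstract.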

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-447398
Date: January 2021
Creators: Michel, David
Publisher: Uppsala universitet, Institutionen för lingvistik och filologi
Source Sets: DiVA Archive at Uppsala University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess