
Semi-supervised Sentiment Analysis for Sentence Classification

In this work, we apply semi-supervised learning methods to sentiment analysis on a corpus of sentences, each to be labeled as happy, neutral, sad, or angry. Sentence-BERT is used to obtain high-dimensional embeddings for the sentences in the training and test sets, to which three classification methods are applied: the K-Nearest Neighbors classifier (KNN), Label Propagation, and Label Spreading. The latter two are graph-based classification methods that are expected to give better predictions than the supervised KNN, because they can propagate the labels of known data to similar (and spatially close) unknown data. We experiment with multiple combinations of labeled and unlabeled data, various hyperparameters, and 4 distinct classes of data, and we perform both binary and fine-grained classification tasks. A custom Radial Basis Function kernel is created for this study, in which Euclidean distance is replaced with cosine similarity to match the metric used in Sentence-BERT. For 2 out of 4 tasks, specifically 3-class and 2-class classification, the two graph-based algorithms outperform the chosen baseline, although their scores are not significantly higher. The supervised KNN classifier performs better on the second 3-class classification and on the 4-class classification, especially when using embeddings of lower dimensionality. We draw two conclusions from the results: first, that the dataset used is most likely not well suited to graph construction, and second, that larger volumes of labeled data are needed for further interpretation.
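As an illustration of the kernel idea described in the abstract, the following is a minimal sketch in plain Python, not the thesis implementation. The abstract does not give the exact kernel formula, so the form exp(-gamma * (1 - cosine similarity)) is an assumption, as are the function names, the `gamma` value, and the toy vectors. The propagation loop is a simplified version of label propagation with clamping of the known labels.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_rbf(u, v, gamma=1.0):
    """RBF-style weight with cosine distance in place of Euclidean distance.

    Assumed form (not taken from the thesis): exp(-gamma * (1 - cos(u, v))).
    """
    return math.exp(-gamma * (1.0 - cosine_similarity(u, v)))

def propagate_labels(X, y, n_classes, gamma=20.0, n_iter=100):
    """Simplified label propagation; y[i] is a class index, or -1 if unlabeled."""
    n = len(X)
    # Fully connected graph weighted by the cosine-based kernel.
    W = [[cosine_rbf(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    # Row-normalise the weights into transition probabilities.
    T = [[w / sum(row) for w in row] for row in W]

    def one_hot(label):
        return [1.0 if label == c else 0.0 for c in range(n_classes)]

    # Unlabeled points start with an all-zero label distribution.
    F = [one_hot(y[i]) for i in range(n)]
    for _ in range(n_iter):
        F = [
            [sum(T[i][k] * F[k][c] for k in range(n)) for c in range(n_classes)]
            for i in range(n)
        ]
        # Clamp the labeled points back to their known labels.
        for i in range(n):
            if y[i] != -1:
                F[i] = one_hot(y[i])
    # Predict the class with the largest propagated mass for each point.
    return [max(range(n_classes), key=lambda c: F[i][c]) for i in range(n)]

# Toy stand-ins for sentence embeddings: two labeled, two unlabeled points.
X = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
y = [0, -1, 1, -1]
predictions = propagate_labels(X, y, n_classes=2)
```

In this toy setup each unlabeled vector points in almost the same direction as one of the labeled vectors, so it receives that neighbour's label. With real sentence embeddings the outcome depends heavily on `gamma` and on how well the classes separate under cosine similarity.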

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-478119
Date: January 2022
Creators: Tsakiri, Eirini
Publisher: Uppsala universitet, Institutionen för lingvistik och filologi
Source Sets: DiVA Archive at Uppsala University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
