Emotional Content in Novels for Literary Genre Prediction: And Impact of Feature Selection on Text Classification Models

Automatic literary genre classification is a challenging task for Natural Language Processing (NLP) systems, mainly because literary texts have deeper levels of meaning, hold distinctive themes, and communicate particular messages and emotions. We conduct a study in which we build literary genre classifiers based on emotions in novels, to investigate the effect that emotion-related features have on genre prediction models. We begin with an analysis of emotions that describes the emotional composition and density of the dataset. The experiments are carried out on a dataset of novels categorized into eight genres. Genre prediction models are built using three algorithms: Random Forest, Support Vector Machine, and k-Nearest Neighbor. We build models based on emotion-word counts and on the emotional words in a novel, and compare them to models built on two commonly used feature sets, bag-of-words and TF-IDF. Moreover, we apply a feature selection (dimensionality reduction) procedure to the TF-IDF feature set and study its impact on classification performance. Finally, we train and test the classifiers on a combination of the two best-performing emotion-related feature sets, and compare them to classifiers trained and tested on a combination of bag-of-words and the reduced TF-IDF features. Our results confirm that using features of emotional content in novels improves classification performance, reaching 75% F1 compared to a bag-of-words baseline of 71% F1, and that the TF-IDF feature filtering method positively impacts genre classification performance on literary texts.
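
As a rough illustration of the emotion-word count and emotional density features mentioned in the abstract, the following minimal Python sketch counts lexicon hits per emotion and divides the total by the token count. The lexicon `TOY_LEXICON`, the tokenizer, and the function name `emotion_features` are hypothetical and chosen only for this example; the abstract does not say which emotion lexicon or preprocessing the thesis uses.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (word -> emotion). This is an assumption for
# illustration; the abstract does not name the lexicon used in the thesis.
TOY_LEXICON = {
    "joy": "joy", "happy": "joy", "laugh": "joy",
    "fear": "fear", "terror": "fear", "afraid": "fear",
    "grief": "sadness", "weep": "sadness",
    "rage": "anger", "furious": "anger",
}
# Fixed emotion order so every text yields a vector of the same shape.
EMOTIONS = sorted(set(TOY_LEXICON.values()))


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def emotion_features(text, lexicon=TOY_LEXICON):
    """Return per-emotion counts and the share of tokens that are emotion words."""
    tokens = tokenize(text)
    counts = Counter(lexicon[t] for t in tokens if t in lexicon)
    total = len(tokens)
    density = sum(counts.values()) / total if total else 0.0
    return {"counts": [counts[e] for e in EMOTIONS], "density": density}


sample = "She was afraid, and the terror grew; yet a happy laugh broke the grief."
features = emotion_features(sample)
print(EMOTIONS, features["counts"], round(features["density"], 3))
```

Vectors like `features["counts"]` could then be passed to any of the three classifiers named in the abstract; how the thesis actually combines or scales these features is not stated here.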

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-447148
Date: January 2021
Creators: Yako, Mary
Publisher: Uppsala universitet, Institutionen för lingvistik och filologi
Source Sets: DiVA Archive at Upsalla University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess