A Preliminary Observation: Can One Linguistic Feature Be the Deterministic Factor for More Accurate Fake News Detection?

This study inspected three linguistic features, namely the percentage of nouns per sentence, the percentage of verbs per sentence, and the mean dependency distance of the sentence, and observed their respective influence on fake news classification accuracy. Whereas previous studies leveraged linguistic features as a combined set, this study attempted to isolate the effective individual features from the previously proposed optimal sets. To keep the influence of each feature independent of the others, the remaining features were held constant in the experiments on each target feature. The study used the FEVER dataset and incorporated weighted random baselines and Macro F1 scores to mitigate the probable bias caused by the imbalanced distribution of labels in the dataset. Both GPT-2 and DistilGPT2 were fine-tuned to measure the performance gap between models with different numbers of parameters. The experimental results indicate that fake news classification accuracy and the features are not always correlated as hypothesized. Nevertheless, having attended to the challenges and limitations imposed by the dataset, this study has paved the way for future studies with similar research purposes. Future work is encouraged to extend the scope and inspect more linguistic features, with the eventual aim of more effective fake news classification that leverages only the most relevant features.
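
As a rough illustration of why the abstract pairs Macro F1 with a weighted random baseline on imbalanced labels, the following is a minimal Python sketch. It is not taken from the thesis: the function names `macro_f1` and `weighted_random_baseline`, the toy label list, and the seed are all assumptions made for the example.

```python
import random
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def weighted_random_baseline(y_true, seed=0):
    """Predict labels at random, in proportion to their observed frequencies."""
    rng = random.Random(seed)
    counts = Counter(y_true)
    labels = sorted(counts)
    weights = [counts[label] for label in labels]
    return rng.choices(labels, weights=weights, k=len(y_true))


# Toy, hypothetical label distribution (8 vs. 2); not data from the thesis.
y_true = ["SUPPORTS"] * 8 + ["REFUTES"] * 2
majority = ["SUPPORTS"] * 10

accuracy = sum(t == p for t, p in zip(y_true, majority)) / len(y_true)
print(f"majority-class accuracy: {accuracy:.2f}")
print(f"majority-class macro F1: {macro_f1(y_true, majority):.2f}")
print(f"weighted-random macro F1: {macro_f1(y_true, weighted_random_baseline(y_true)):.2f}")
```

On this toy split, always predicting the majority class reaches high accuracy while its Macro F1 stays low, because the minority class contributes an F1 of zero. A fine-tuned classifier's Macro F1 can then be compared against the weighted random score rather than against raw accuracy.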

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-507873
Date: January 2023
Creators: Chen, Yini
Publisher: Uppsala universitet, Institutionen för lingvistik och filologi
Source Sets: DiVA Archive at Uppsala University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess