The Struggle Against Misinformation: Evaluating the Performance of Basic vs. Complex Machine Learning Models on Manipulated Data

This study investigates the application of machine learning (ML) techniques to fake news detection, addressing the rapid spread of misinformation across social media platforms. Because manual fact-checking is time-consuming, the research compares the robustness of basic machine learning models, such as Multinomial Naive Bayes classifiers, with that of complex models such as DistilBERT in identifying fake news. Using datasets including LIAR, ISOT, and GM, the study evaluates these models on standard classification metrics in both single-domain and cross-domain scenarios, with particular attention to linguistically manipulated data. Results indicate that while complex models such as DistilBERT perform better in single-domain classification, the baseline models show competitive performance in cross-domain settings and on the manipulated dataset. However, both types of model struggle with the manipulated dataset, highlighting a critical area for improvement in fake news detection algorithms and methods. In conclusion, the findings suggest that while both basic and complex models have their strengths in certain settings, significant advances in robustness against linguistic manipulation are needed to ensure reliable detection of fake news across varied contexts before automated classification is considered for public release.
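
To make the cross-domain setup concrete, the following is a minimal sketch of a Multinomial Naive Bayes baseline trained on one domain and scored on another with standard classification metrics. It is not the thesis's code: the whitespace tokenizer, the add-one smoothing, the names `MultinomialNB` and `binary_metrics`, and the toy headlines are all assumptions made for illustration, and none of the text comes from LIAR, ISOT, or GM.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


class MultinomialNB:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict_one(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            # Log prior plus smoothed log likelihood of each known word.
            score = math.log(self.class_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in tokenize(text):
                if word in self.vocab:  # words unseen in training are skipped
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    def predict(self, texts):
        return [self.predict_one(t) for t in texts]


def binary_metrics(y_true, y_pred, positive="fake"):
    """Accuracy, precision, recall and F1 for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Invented toy data: train on one "domain", evaluate on another.
train_texts = ["shocking secret the senator hides", "senate passes budget bill"]
train_labels = ["fake", "real"]
test_texts = ["shocking secret cure doctors hide", "hospital publishes annual report"]
test_labels = ["fake", "real"]

model = MultinomialNB().fit(train_texts, train_labels)
print(binary_metrics(test_labels, model.predict(test_texts)))
```

Because words never seen in training are skipped, a test item from a different domain, or one whose wording has been altered, may be scored on very few features. That is one plausible reason a bag-of-words baseline can degrade under domain shift or manipulation, though the sketch itself demonstrates nothing about the thesis's datasets.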

Identifier oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:su-231048
Date January 2024
Creators Valladares Parker, Diego Gabriel
Publisher Stockholms universitet, Avdelningen för datorlingvistik
Source Sets DiVA Archive at Uppsala University
Language English
Detected Language English
Type Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format application/pdf
Rights info:eu-repo/semantics/openAccess
