Global ETD Search

Classifying Hate Speech using Fine-tuned Language Models

As social media has exploded in size, the amount of hate speech has grown with it. To combat this issue efficiently, we need reliable and scalable machine learning models. Current solutions rely either on crowdsourced datasets, which are limited in size, or on training data from self-identified hateful communities, which lacks specificity. In this thesis we introduce a novel semi-supervised modelling strategy. The model is first trained on the freely available data from the hateful communities and then fine-tuned to classify hateful tweets from crowdsourced annotated datasets. We show that our model reaches state-of-the-art performance with minimal hyper-parameter tuning.

http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-352637

machine learning

natural language processing

hate speech

transfer learning

semi-supervised learning

recurrent neural networks

Probability Theory and Statistics

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-352637
Date	January 2018
Creators	Brorson, Erik
Publisher	Uppsala universitet, Statistiska institutionen
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
