Global ETD Search

Transfer Learning for Automatic Author Profiling with BERT Transformers and GloVe Embeddings

Historically, author profiling has been used in forensic linguistics. Only in recent decades has the method made its way into computer science and machine learning, although determining author profiling characteristics with machine learning is no longer new. This thesis investigates whether previous results can be improved with modern frameworks, using data sets that have seen limited use. The purpose of this master's thesis was to apply pre-trained transformers or embeddings together with transfer learning, and to examine whether general author profiling characteristics of anonymous users on internet forums or in social media conversations could be determined. The data sets used to investigate these questions were PAN15 and PANDORA, which contain text data from authors paired with ground-truth labels such as gender, age, and Big Five/OCEAN personality traits. Transfer learning with BERT and GloVe was used as a starting point to decrease the training time on a new task. PAN15, a Twitter data set, did not contain enough data to train a model and was therefore augmented with PANDORA, a Reddit-based data set. Ultimately, BERT obtained the best performance using a stacked approach, achieving 86-91% accuracy for each label on unseen data.

http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-89737

machine learning

natural language processing

Computer science (Datavetenskap, datalogi)

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:ltu-89737
Date	January 2022
Creators	From, Viktor
Publisher	Luleå tekniska universitet, Institutionen för system- och rymdteknik
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
