Multilingual Transformer Models for Maltese Named Entity Recognition

The recently developed state-of-the-art models for Named Entity Recognition depend heavily on large amounts of annotated data. Consequently, it is extremely challenging for data-scarce languages to obtain significant results. Several approaches have been proposed to circumvent this issue, including cross-lingual transfer learning, in which knowledge obtained from resources available in a source language is leveraged and transferred to a low-resource target language.

Maltese is one of many severely under-resourced languages. The main purpose of this project is to investigate how recently developed multilingual transformer models (Multilingual BERT and XLM-RoBERTa) perform, and ultimately to establish an evaluation benchmark for zero-shot cross-lingual transfer learning for Maltese Named Entity Recognition. The models are fine-tuned on Arabic, English, Italian, Spanish, and Dutch. The experiments evaluated the efficacy of the source languages and the use of multilingual data in both the training and validation stages.

The experiments demonstrated that feeding multilingual data to both the training and validation phases was mostly beneficial to performance, whereas adding it to the validation phase only was generally detrimental. Furthermore, XLM-R achieved better scores overall; however, the single best performance was obtained with mBERT fine-tuned on English.
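As a rough, non-authoritative sketch of the zero-shot setup described above: the snippet below fine-tunes XLM-RoBERTa on an English NER corpus and then evaluates it directly on Maltese, which the model never sees during training. The corpus (WikiANN, which includes a Maltese split), checkpoint names, and hyperparameters are illustrative assumptions rather than the thesis's exact configuration; the sketch uses the Hugging Face transformers and datasets libraries.

from datasets import load_dataset
from transformers import (AutoModelForTokenClassification, AutoTokenizer,
                          DataCollatorForTokenClassification,
                          Trainer, TrainingArguments)

MODEL = "xlm-roberta-base"               # or "bert-base-multilingual-cased" for mBERT
source = load_dataset("wikiann", "en")   # source-language data (English)
target = load_dataset("wikiann", "mt")   # target-language data (Maltese), evaluation only

labels = source["train"].features["ner_tags"].feature.names
tokenizer = AutoTokenizer.from_pretrained(MODEL)

def tokenize_and_align(batch):
    # Subword tokenization splits words apart; copy each word's NER tag to
    # its first subword and mask the rest with -100 so the loss ignores them.
    enc = tokenizer(batch["tokens"], truncation=True, is_split_into_words=True)
    enc["labels"] = []
    for i, tags in enumerate(batch["ner_tags"]):
        prev, aligned = None, []
        for w in enc.word_ids(batch_index=i):
            aligned.append(-100 if w is None or w == prev else tags[w])
            prev = w
        enc["labels"].append(aligned)
    return enc

source = source.map(tokenize_and_align, batched=True,
                    remove_columns=source["train"].column_names)
target = target.map(tokenize_and_align, batched=True,
                    remove_columns=target["test"].column_names)

model = AutoModelForTokenClassification.from_pretrained(MODEL, num_labels=len(labels))
trainer = Trainer(
    model=model,
    args=TrainingArguments("ner-zero-shot", num_train_epochs=3,
                           per_device_train_batch_size=32),
    train_dataset=source["train"],
    eval_dataset=source["validation"],   # source-language-only validation in this sketch
    data_collator=DataCollatorForTokenClassification(tokenizer),
)
trainer.train()

# Zero-shot evaluation: no Maltese data was used for fine-tuning.
# (Span-level F1 would additionally require a compute_metrics hook, e.g. seqeval.)
print(trainer.evaluate(target["test"]))

Under the same assumptions, swapping the source config "en" for "ar", "it", "es", or "nl" would reproduce the other single-source conditions, while the multilingual conditions would concatenate several source-language splits in the training and/or validation sets.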

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-479629
Date: January 2022
Creators: Farrugia, Kris
Publisher: Uppsala universitet, Institutionen för lingvistik och filologi (Department of Linguistics and Philology)
Source Sets: DiVA Archive at Uppsala University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess