Return to search

Multilingual Dependency Parsing of Uralic Languages : Parsing with zero-shot transfer and cross-lingual models using geographically proximate, genealogically related, and syntactically similar transfer languages

One way to improve dependency parsing scores for low-resource languages is to make use of existing resources from other closely related or otherwise similar languages. In this paper, we look at eleven Uralic target languages (Estonian, Finnish, Hungarian, Karelian, Livvi, Komi Zyrian, Komi Permyak, Moksha, Erzya, North Sámi, and Skolt Sámi) with treebanks of varying sizes and select transfer languages based on geographical, genealogical, and syntactic distances. We focus primarily on the performance of parser models trained on various combinations of geographically proximate and genealogically related transfer languages, in target-trained, zero-shot, and cross-lingual configurations. We find that models trained on combinations of geographically proximate and genealogically related transfer languages reach the highest LAS in most zero-shot models, while our highest-performing cross-lingual models were trained on genealogically related languages. We also find that cross-lingual models outperform zero-shot transfer models. We then select syntactically similar transfer languages for three target languages, and find a slight improvement in the case of Hungarian. We discuss the results and conclude with suggestions for possible future work.

Identiferoai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-427278
Date January 2020
CreatorsErenmalm, Elsa
PublisherUppsala universitet, Institutionen för lingvistik och filologi
Source SetsDiVA Archive at Upsalla University
LanguageEnglish
Detected LanguageEnglish
TypeStudent thesis, info:eu-repo/semantics/bachelorThesis, text
Formatapplication/pdf
Rightsinfo:eu-repo/semantics/openAccess

Page generated in 0.002 seconds