Creating a Dependency Treebank for Yorùbá Using Parallel Data (Tvorba závislostního korpusu pro jorubštinu s využitím paralelních dat)

The goal of this thesis is to create a dependency treebank for Yorùbá, a language with very few pre-existing machine-readable resources. The treebank follows the Universal Dependencies (UD) annotation standard, with certain language-specific guidelines specified for Yorùbá. Known techniques for porting resources from resource-rich languages were tested, in particular the projection of annotation across parallel bilingual data. Manual annotation is not the main focus of this thesis; nevertheless, a small portion of the data was verified manually in order to evaluate the annotation quality. In addition, a model was trained on the manually annotated data using UDPipe.

Identifier: oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:387891
Date: January 2018
Creators: Oluokun, Adedayo
Contributors: Zeman, Daniel; Rosa, Rudolf
Source Sets: Czech ETDs
Language: English
Detected Language: English
Type: info:eu-repo/semantics/masterThesis
Rights: info:eu-repo/semantics/restrictedAccess