Return to search

Syntaktická analýza textů se střídáním kódů / Syntaktická analýza textů se střídáním kódů

(English) Vinit Ravishankar July 2018 The aim of this thesis is twofold; first, we attempt to dependency parse existing code-switched corpora, solely by training on monolingual dependency treebanks. In an attempt to do so, we design a dependency parser and ex- periment with a variety of methods to improve upon the baseline established by raw training on monolingual treebanks: these methods range from treebank modification to network modification. On this task, we obtain state-of-the- art results for most evaluation criteria on the task for our evaluation language pairs: Hindi/English and Komi/Russian. We beat our own baselines by a sig- nificant margin, whilst simultaneously beating most scores on similar tasks in the literature. The second part of the thesis involves introducing the relatively understudied task of predicting code-switching points in a monolingual utter- ance; we provide several architectures that attempt to do so, and provide one of them as our baseline, in the hopes that it should continue as a state-of-the-art in future tasks. 1

Identiferoai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:387870
Date January 2018
CreatorsRavishankar, Vinit
ContributorsZeman, Daniel, Mareček, David
Source SetsCzech ETDs
LanguageEnglish
Detected LanguageEnglish
Typeinfo:eu-repo/semantics/masterThesis
Rightsinfo:eu-repo/semantics/restrictedAccess

Page generated in 0.0018 seconds