Discovering the structure of natural language sentences by semi-supervised methods

Rudolf Rosa

In this thesis, we focus on the problem of automatic syntactic analysis of a language for which no syntactically annotated training data exist. We explore several methods for cross-lingual transfer of syntactic as well as morphological annotation, all ultimately based on bilingual or multilingual sentence-aligned corpora and machine translation approaches. We pay particular attention to automatically estimating how appropriate a source language is for the analysis of a given target language, devising a novel measure based on the similarity of the part-of-speech sequences that are frequent in the two languages. The effectiveness of the presented methods has been confirmed by experiments conducted both by us and independently by other researchers.
Identifier | oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:387260 |
Date | January 2018 |
Creators | Rosa, Rudolf |
Contributors | Žabokrtský, Zdeněk; Tiedemann, Jörg; Horák, Aleš |
Source Sets | Czech ETDs |
Language | English |
Detected Language | English |
Type | info:eu-repo/semantics/doctoralThesis |
Rights | info:eu-repo/semantics/restrictedAccess |