Learning Morphology for Open-Vocabulary Neural Machine Translation

State-of-the-art neural machine translation systems typically translate rare or unseen words with low accuracy because they must be trained with a fixed-size word vocabulary. This limitation serves to control model complexity, but it also reflects the difficulty of learning accurate word representations under high data sparsity. The problem is a major performance bottleneck, especially in morphologically rich languages, where the word vocabulary tends to be huge and sparse. In this dissertation, we propose to solve the vocabulary limitation problem in neural machine translation by integrating morphology learning into the translation model, which helps it learn richer word representations in terms of phonological and morphological information. Our model improves accuracy when translating into low-resource and morphologically rich languages and generalizes better across a variety of languages with different morphological characteristics.

https://hdl.handle.net/11572/368927

Settore INF/01 - Informatica

Identifier	oai:union.ndltd.org:unitn.it/oai:iris.unitn.it:11572/368927
Date	January 2019
Creators	Ataman, Duygu
Contributors	Ataman, Duygu; Federico, Marcello
Publisher	Università degli studi di Trento, place:TRENTO
Source Sets	Università di Trento
Language	English
Detected Language	English
Type	info:eu-repo/semantics/doctoralThesis
Rights	info:eu-repo/semantics/openAccess
Relation	firstpage:1, lastpage:164, numberofpages:164
