English to ASL Gloss Machine Translation

Low-resource languages, including sign languages, are a challenge for machine translation research. Given the lack of parallel corpora, current researchers must be content with a small parallel corpus in a narrow domain for training a system. For this thesis, we obtained a small parallel corpus of English text and American Sign Language gloss from The Church of Jesus Christ of Latter-day Saints. We cleaned the corpus by loading it into an open-source translation memory tool, where we removed computer markup language and split the large chunks of text into sentences and phrases, creating a total of 14,247 sentence pairs. We randomly partitioned the corpus into three sections: 70% for a training set, 10% for a development set, and 20% for a test set. After downloading and installing the open-source Moses toolkit, we went through several iterations of training, translating, and evaluating the system. The final evaluation on unseen data yielded a state-of-the-art score for a low-resource language.
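The random 70/10/20 partition described above can be sketched as follows. This is a minimal illustration only, not the thesis's actual procedure: the function name `partition_corpus`, the fixed seed, the rounding of split sizes, and the placeholder sentence pairs are all assumptions introduced here.

```python
import random


def partition_corpus(pairs, train_frac=0.7, dev_frac=0.1, seed=0):
    """Shuffle sentence pairs and split them into train/dev/test lists.

    Hypothetical helper: the fractions mirror the 70/10/20 split in the
    abstract; the seed and the truncating rounding are assumptions.
    """
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = int(len(shuffled) * train_frac)
    n_dev = int(len(shuffled) * dev_frac)
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]  # whatever remains, roughly 20%
    return train, dev, test


# Placeholder data standing in for the 14,247 English/ASL-gloss pairs.
pairs = [(f"english sentence {i}", f"GLOSS {i}") for i in range(14247)]
train, dev, test = partition_corpus(pairs)
print(len(train), len(dev), len(test))
```

Keeping each English sentence and its gloss together as one tuple before shuffling ensures the two sides of the corpus stay aligned; each split could then be written out as a pair of line-aligned text files, the parallel-corpus layout that toolkits such as Moses expect.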

Identifier: oai:union.ndltd.org:BGMYU2/oai:scholarsarchive.byu.edu:etd-6477
Date: 01 June 2015
Creators: Bonham, Mary Elizabeth
Publisher: BYU ScholarsArchive
Source Sets: Brigham Young University
Detected Language: English
Type: text
Format: application/pdf
Source: Theses and Dissertations
Rights: http://lib.byu.edu/about/copyright/