Translationese and Swedish-English Statistical Machine Translation

This thesis investigates how well machine-learned classifiers can identify translated text, and what effect translationese may have on Statistical Machine Translation (SMT), in both the Swedish-to-English and English-to-Swedish directions. Translationese is a term used to describe the dialect of a target language that is produced when a source text is translated. The systems trained for this thesis are SVM-based classifiers for identifying translationese, as well as translation and language models for SMT. The classifiers successfully distinguished translationese from non-translated text and, to some extent, also identified which source language the texts were translated from. In the SMT experiments, varying the translation model affected the BLEU scores the most. Systems configured with non-translated source text and translationese target text performed better than their reversed counterparts. The language model experiments showed that models trained on known translationese and on classified translationese performed better than those trained on known non-translated text, though classified translationese did not perform as well as known translationese. Ultimately, the thesis shows that translationese can be identified by machine-learned classifiers and may affect the results of SMT systems.

http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-305199

Translationese

Statistical Machine Translation

Text Classification

Classification of Translationese

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-305199
Date	January 2016
Creators	Joelsson, Jakob
Publisher	Uppsala universitet, Institutionen för lingvistik och filologi
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
