Automatic Recognition and Classification of Translation Errors in Human Translation / Automatisk igenkänning och klassificering av fel i mänsklig översättning

Grading assignments is a time-consuming part of teaching translation. Automatic tools that facilitate this task would allow teachers of professional translation to focus more on other aspects of their job. Within Natural Language Processing, error recognition has not been studied for human translation in particular. This thesis is a first attempt at both error recognition and classification with both mono- and bilingual models. BERT (a pre-trained monolingual language model) and NuQE (a model adapted from the field of Quality Estimation for Machine Translation) are trained on a relatively small hand-annotated corpus of student translations. Due to the nature of the task, errors are quite rare in relation to correctly translated tokens in the corpus. To account for this, we train the models with both under- and oversampled data. While both models detect errors with moderate success, the NuQE model adapts very poorly to the classification setting. Overall, scores are quite low, which can be attributed to class imbalance and the small amount of training data, as well as some general concerns about the corpus annotations. However, we show that powerful monolingual language models can detect formal, lexical and translational errors with some success and that, depending on the model, simple under- and oversampling approaches can already help a great deal to avoid pure majority class prediction.
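
The under- and oversampling mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the label strings "OK" and "ERROR", the 90/10 split and the choice to sample individual labelled examples are all assumptions made for illustration. The abstract does not say at which level (token, sentence or otherwise) the sampling was done.

```python
import random
from collections import Counter

def undersample(examples, majority_label, seed=0):
    """Randomly drop majority-class examples until both groups are the same size.

    `examples` is a list of (item, label) pairs; the labels are hypothetical.
    """
    rng = random.Random(seed)
    majority = [ex for ex in examples if ex[1] == majority_label]
    minority = [ex for ex in examples if ex[1] != majority_label]
    keep = min(len(majority), len(minority))
    sampled = rng.sample(majority, keep) + minority
    rng.shuffle(sampled)
    return sampled

def oversample(examples, majority_label, seed=0):
    """Randomly duplicate minority-class examples until both groups are the same size."""
    rng = random.Random(seed)
    majority = [ex for ex in examples if ex[1] == majority_label]
    minority = [ex for ex in examples if ex[1] != majority_label]
    if not minority:
        return list(examples)
    missing = max(0, len(majority) - len(minority))
    extra = [rng.choice(minority) for _ in range(missing)]
    sampled = majority + minority + extra
    rng.shuffle(sampled)
    return sampled

# Toy data: 90 correct items and 10 erroneous ones (made-up proportions).
data = [(f"tok{i}", "OK") for i in range(90)] + [(f"err{i}", "ERROR") for i in range(10)]
under = undersample(data, "OK")
over = oversample(data, "OK")
print(Counter(label for _, label in under))
print(Counter(label for _, label in over))
```

Undersampling discards correct examples and so shrinks an already small corpus, while oversampling keeps all data but repeats the rare error examples. Either way the balanced set would be used only for training, with evaluation on the original distribution.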

http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-420289

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-420289
Date	January 2020
Creators	Dürlich, Luise
Publisher	Uppsala universitet, Institutionen för lingvistik och filologi
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
