This paper describes the process of building a machine translation lexicon for the train and transport domain, for use with the machine translation system MATS. The lexicon consists of a Swedish part, an English part and the links between them. It is derived from a Trados translation memory that is split into a training part (90%) and a test part (10%). The task is carried out mainly by using existing word linking software and by recycling previous machine translation lexicons from other domains. To this end, a method is developed that focuses on automation, using both existing and self-developed software in combination with manual interaction. The domain-specific lexicon is then extended with a domain-neutral core lexicon and a less domain-neutral general lexicon. The different lexicons are evaluated both automatically and manually by machine-translating the test corpus. The automatic evaluation of the largest lexicon yielded a NEVA score of 0.255 and a BLEU score of 0.190. In the manual evaluation, 34% of the segments were translated correctly, 37% were not correct but perfectly understandable, and 29% were difficult to understand.
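The 90/10 division of the translation memory can be illustrated with a minimal sketch. The abstract does not say how the segments were assigned to each part, so the seeded random shuffle here is an assumption, and the function name `split_segments` and the toy segment pairs are hypothetical stand-ins rather than the authors' tooling or data.

```python
import random


def split_segments(pairs, train_fraction=0.9, seed=0):
    """Shuffle aligned segment pairs and split them into training and test parts.

    Assumption: a seeded random split; the source does not state the actual procedure.
    """
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical stand-in for (Swedish, English) translation-memory segments.
pairs = [(f"sv segment {i}", f"en segment {i}") for i in range(20)]
train, test = split_segments(pairs)
print(len(train), len(test))
```

Fixing the seed keeps the split reproducible, so that lexicons built at different stages are evaluated against the same held-out segments.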
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-8269 |
Date | January 2006 |
Creators | Axelsson, Hans; Blom, Oskar |
Publisher | Uppsala universitet, Institutionen för lingvistik och filologi |
Source Sets | DiVA Archive at Uppsala University |
Language | Swedish |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |