In this thesis I present an unsupervised approach, which can be made supervised, for reducing the translation of changes in structured information stored in XML documents. By combining a sentence boundary detection algorithm and a sentence alignment algorithm, a translation memory is created from the old version of the information in different languages. This translation memory can then be used to translate sentences that have not changed. The structure of the XML is used to improve performance. Two implementations were made and evaluated in three steps: sentence boundary detection, sentence alignment and correspondence. The last step evaluates the use of the translation memory on a new version in the source language. The second implementation was an improvement on the first, based on the results of the first implementation's evaluation. The evaluation was done using 100 XML documents in English, German and Swedish. There was a significant difference between the results of the implementations in the first two steps. The errors were reduced with each step, and in the last step there were only three errors with the first implementation and none with the second. The evaluation showed that it was possible to reduce the text that requires re-translation by about 80%. Similar information can be, and is, used by translators to achieve higher productivity, but this thesis shows that it is possible to reduce translation even before the texts reach the translators.
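
The pipeline described in the abstract (sentence boundary detection, sentence alignment, then reuse of unchanged sentences through a translation memory) can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the regex-based sentence splitter, the position-based alignment and the example sentences are all assumptions made for illustration, and the XML structure is ignored entirely.

```python
import re


def split_sentences(text):
    """Naive sentence boundary detection (assumption): split after . ! or ?"""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def build_translation_memory(old_source, old_target):
    """Pair sentences by position, a stand-in for a real alignment algorithm."""
    src = split_sentences(old_source)
    tgt = split_sentences(old_target)
    if len(src) != len(tgt):
        raise ValueError("position-based alignment needs equal sentence counts")
    return dict(zip(src, tgt))


def translate_unchanged(new_source, memory):
    """Return (sentence, translation or None) pairs for the new source version."""
    return [(s, memory.get(s)) for s in split_sentences(new_source)]


# Hypothetical old version in two languages and a new source version.
old_en = "Open the valve. Check the pressure. Close the valve."
old_sv = "Öppna ventilen. Kontrollera trycket. Stäng ventilen."
new_en = "Open the valve. Check the oil level. Close the valve."

memory = build_translation_memory(old_en, old_sv)
result = translate_unchanged(new_en, memory)

# Sentences found in the memory need no re-translation.
reused = sum(1 for _, translation in result if translation is not None)
```

In this toy input only the changed middle sentence has no entry in the memory, so it is the only one that would be sent for translation.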
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-79363 |
Date | January 2012 |
Creators | Resman, Daniel |
Publisher | Linköpings universitet, Interaktiva och kognitiva system, Linköpings universitet, Tekniska högskolan |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |