1 |
Implementation och utvärdering av termlänkare i Java / Implementation and evaluation of a term aligner in Java
Axelsson, Robin (January 2013)
Parallel terms in a parallel corpus can be aligned by first aligning all words and phrases in the corpus and then performing term extraction on the aligned set of word pairs. Alternatively, term extraction can be performed separately on the source and target texts, and the resulting term candidates can then be aligned to form aligned parallel terms. This thesis describes an implementation of a word aligner that is applied to extracted term candidates in both the source and the target texts. The term aligner uses statistical measures, the tool Giza++ and heuristics in its search for alignments. The evaluation shows that the best results are obtained when the term alignment relies heavily on the Giza++ tool and the Levenshtein heuristic.
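As an illustration of the kind of Levenshtein heuristic the abstract mentions, here is a minimal sketch. It is not the thesis's implementation, which is written in Java and also combines Giza++ output with statistical measures that are not modelled here. The function names, the greedy pairing strategy and the 0.7 threshold are all assumptions made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance by dynamic programming (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Edit distance normalised to [0, 1]; 1.0 means identical (case-insensitive)."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def align_terms(source_terms, target_terms, threshold=0.7):
    """Hypothetical greedy pairing: each source term is paired with its most
    similar target term, provided the similarity reaches the threshold."""
    pairs = []
    for s in source_terms:
        best = max(target_terms, key=lambda t: similarity(s, t), default=None)
        if best is not None and similarity(s, best) >= threshold:
            pairs.append((s, best))
    return pairs
```

A string-similarity heuristic like this only helps for cognates and loanwords (for example English "database" and Swedish "databas"); unrelated translations need the statistical evidence that a tool such as Giza++ provides.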
|
2 |
To bury a child : A spatial analysis of child burials at Giza and Saqqara / Att begrava ett barn : En rumslig analys av barngravar i Giza och Saqqara
Hedin Käck, Mimmi (January 2024)
This thesis investigates whether there is any age segmentation among child burials. This is done through a spatial analysis, to better understand children in a mortuary space. The two cemeteries investigated are the Wall of the Crow in Late Period Giza and the so-called Upper Necropolis in Saqqara during the Ptolemaic Period. The study includes 83 child burials from Saqqara and 73 child burials from Giza, ranging from newborns to children 14 years of age. The analysis deconstructs the available data to clarify which bonds exist, or do not exist, between the burials. In addition, child mortality and burial customs are discussed to better understand burial rates in comparison to mortality rates. Finally, to properly understand the cemetery space, and children as a subset of it, the wider constructed landscapes are considered. The outcome of the thesis is that neither cemetery showed any age segmentation.
|
3 |
Comparaison de deux techniques de décodage pour la traduction probabiliste / Comparison of two decoding techniques for probabilistic translation
Awdé, Ali (January 2003)
Thesis digitized by the Direction des bibliothèques of the Université de Montréal.
|
4 |
Utveckling av ett verktyg för länkning och bedömning av översättningar / Development of a tool for alignment and assessment of translations
Eriksson, Joel (January 2015)
Today there are many systems for assessing and interpreting translations of texts. Some systems align parts of a source text with its translation, and there are also techniques that assess translations to give a measure of how good they are. One example of such a technique is the Token Equivalence Method (TEM). However, few programs, if any, use both alignment and assessment in a way that would make them useful in, for example, language education. This work develops just such a program. The resulting program can segment and align parallel texts fully automatically through plugged-in systems. To improve usability, the program also visualizes the alignment and allows both the segmentation and the alignment to be edited. The alignment is then used to compute and display parts of TEM as a measure of the translation's quality.
|
5 |
Překladač z češtiny do slovenštiny / Czech-Slovak Machine Translation
Mydliar, Ján (January 2013)
This Master's thesis deals with machine translation from Czech to Slovak. The first chapter motivates the work, the second discusses various approaches to machine translation, and the third details how the methods are evaluated. Chapter 4 introduces the design and implementation of my system, paying special attention to a new parallel corpus that has been created. Chapter 5 summarizes the testing and evaluation of the developed system.
|
6 |
Evaluation of two word alignment systems
Wang, Xiaoyang (January 2004)
This project evaluates two different systems that generate word alignments on English-Swedish data. The systems used are Giza++, which can generate a variety of statistical translation models, and I*Trix, developed at IDA/NLPLab, which generates word pairs with frequencies. The paper covers the file formats of the two systems, how to run them and the differences between them. The evaluation considers a variety of parameters such as corpus size, characteristics of the corpus and the effect of linguistic knowledge. The paper closes with the conclusions of the evaluation. In general, Giza++ performs better on large corpora while I*Trix performs better on small corpora. I*Trix performs especially well on corpora with high statistical ratio or special resource.
|
8 |
Automatická tvorba slovníků z překladových textů / Automatic Creation of Dictionaries from Translations
Musil, Jakub (January 2010)
The aim of this thesis is to implement a system that translates words from a source language into a target language using a pair of input texts. It describes the terms and methods used in machine translation and in machine-built dictionaries. The thesis also presents the design and specification of each part of the system, together with a final evaluation. Options for extending an existing dictionary are also analysed.
|
9 |
Překlad z češtiny do angličtiny / Czech-English Translation
Petrželka, Jiří (January 2010)
This Master's thesis describes the principles of statistical machine translation and demonstrates how to build the Moses statistical machine translation system. In the preparatory phase, freely available bilingual Czech-English corpora are surveyed. An empirical analysis of the running time of multithreaded word alignment tools shows that MGIZA++ can achieve up to a fivefold speedup and PGIZA++ up to an eightfold speedup (compared with GIZA++). Three methods of morphological pre-processing of the Czech training data are tested using simple non-factored models. While simple lemmatization can lower BLEU, more sophisticated approaches mostly raise it. The positive effects of morphological pre-processing fade as corpus size grows. The relationship between other corpus characteristics (size, genre, additional data) and the resulting BLEU is measured empirically. The final system is trained on the CzEng 0.9 corpus and evaluated on the test set from the WMT 2010 workshop.
|