Global ETD Search

1	Implementation och utvärdering av termlänkare i Java / Implementation and Evaluation of a Term Aligner in Java Axelsson, Robin January 2013 Parallel terms in a parallel corpus can be aligned by first aligning all words and phrases in the corpus and then performing term extraction on the aligned set of word pairs. Alternatively, term extraction can be performed separately on the source and target texts, and the resulting term candidates can then be aligned to form aligned parallel terms. This thesis describes an implementation of a word aligner that is applied to term candidates extracted from both the source and the target texts. The term aligner uses statistical measures, the tool Giza++ and heuristics in its search for alignments. The evaluation shows that the best results are obtained when the term alignment relies heavily on Giza++ and the Levenshtein heuristic. termlänkning (term alignment) giza++
2	To bury a child : A spatial analysis of child burials at Giza and Saqqara / Att begrava ett barn : En rumslig analys av barngravar i Giza och Saqqara Hedin Käck, Mimmi January 2024 This thesis investigates whether there is any age segmentation among child burials. This is done through a spatial analysis, to better understand children in a mortuary space. The two cemeteries investigated are the Wall of the Crow in Late Period Giza and the so-called Upper Necropolis in Saqqara during the Ptolemaic Period. The study includes 83 child burials from Saqqara and 73 child burials from Giza, ranging from newborns to children of 14 years of age. This is achieved by deconstructing the available data to clarify which connections do or do not exist between the burials. In addition, child mortality and burial customs are discussed to better understand burial rates in comparison to mortality rates. Finally, to properly understand the cemetery space, and children as a subset of it, the wider constructed landscapes are considered. The study found no age segmentation in either cemetery. Child burials child mortality age landscape Late Period Ptolemaic Period Giza Saqqara Humanities and the Arts
3	Comparaison de deux techniques de décodage pour la traduction probabiliste / Comparison of Two Decoding Techniques for Probabilistic Translation Awdé, Ali January 2003 Thesis digitized by the Direction des bibliothèques de l'Université de Montréal. Statistical machine translation Search algorithm Alignment model Greedy algorithm DP algorithm GIZA
4	Utveckling av ett verktyg för länkning och bedömning av översättningar / Development of a Tool for Aligning and Assessing Translations Eriksson, Joel January 2015 Today there are many systems for assessing and interpreting translations of texts. There are systems that link parts of a source text to a translation, and there are also techniques for assessing translations in order to give a measure of how good they are. One example of such a technique is the Token Equivalence Method (TEM). However, there are few programs, if any, that use both linking and assessment in a way that could be useful in, for example, language education. This work develops just such a program. The resulting program can segment parallel texts and link them to each other fully automatically via connected systems. To improve usability, the program also visualizes the linking and allows both the segmentation and the linking to be edited. The linking is then used to compute and display parts of TEM as a measure of the quality of the translation. giza++ hunalign bitext word alignment token equivalence method Computer Sciences
5	Překladač z češtiny do slovenštiny / Czech-Slovak Machine Translation Mydliar, Ján January 2013 This master's thesis deals with machine translation from Czech to Slovak. The first chapter motivates the work, the second discusses various approaches to machine translation, and the third details how the methods are evaluated. Chapter 4 introduces the design and implementation of my system, paying special attention to a newly created parallel corpus. Chapter 5 summarizes the testing and evaluation of the developed system.
6	Evaluation of two word alignment systems Wang, Xiaoyang January 2004 This project evaluates two different systems that generate word alignments on English-Swedish data. The systems are Giza++, which can generate a variety of statistical translation models, and ITrix, developed at IDA/NLPLab, which generates word pairs with frequencies. The paper addresses the file formats of the two systems, how they are run, and the differences between them. The evaluation considers a variety of parameters, such as corpus size, characteristics of the corpus, and the effect of linguistic knowledge. The paper closes with the conclusions of the evaluation. In general, Giza++ performs better on large corpora, while ITrix performs better on small corpora. ITrix performs especially well on corpora with a high statistical ratio or special resources. Word alignment Giza++ ITrix Parallel corpora Statistical ratio Evaluation I*Eval Gold standard Computer science
8	Automatická tvorba slovníků z překladových textů / Automatic Creation of Dictionaries from Translations Musil, Jakub January 2010 The aim of this thesis is to implement a system that translates words from a source language into a target language, given a pair of input texts. It describes the terms and methods used in machine translation and in machine-built dictionaries. The thesis also presents the design and specification of each part of the system, together with a final evaluation. Options for extending an existing dictionary are also analysed.
9	Překlad z češtiny do angličtiny / Czech-English Translation Petrželka, Jiří January 2010 This master's thesis describes the principles of statistical machine translation and demonstrates how to build the Moses statistical machine translation system. In the preparatory phase, freely available bilingual Czech-English corpora are surveyed. An empirical analysis of the running time of multithreaded word alignment tools shows that MGIZA++ can achieve up to a fivefold speedup and PGIZA++ up to an eightfold speedup (compared with GIZA++). Three methods of morphological pre-processing of the Czech training data are tested using simple non-factored models. While simple lemmatization can lower BLEU, more sophisticated approaches mostly raise it. The positive effects of morphological pre-processing fade as the corpus grows. The relationship between other corpus characteristics (size, genre, additional data) and the resulting BLEU is measured empirically. The final system is trained on the CzEng 0.9 corpus and evaluated on a test set from the WMT 2010 workshop.
