Global ETD Search

111	Going beyond the sentence : Contextual Machine Translation of Dialogue / Au-delà de la phrase : traduction automatique de dialogue en contexte Bawden, Rachel 29 November 2018 Les systèmes de traduction automatique (TA) ont fait des progrès considérables ces dernières années. La majorité d'entre eux reposent pourtant sur l'hypothèse que les phrases peuvent être traduites indépendamment les unes des autres. Ces modèles de traduction ne s'appuient que sur les informations contenues dans la phrase à traduire. Ils n'ont accès ni aux informations présentes dans les phrases environnantes ni aux informations que pourrait fournir le contexte dans lequel ces phrases ont été produites. La TA contextuelle a pour objectif de dépasser cette limitation en explorant différentes méthodes d'intégration du contexte extra-phrastique dans le processus de traduction. Les phrases environnantes (contexte linguistique) et le contexte de production des énoncés (contexte extra-linguistique) peuvent fournir des informations cruciales pour la traduction, notamment pour la prise en compte des phénomènes discursifs et des mécanismes référentiels. La prise en compte du contexte est toutefois un défi pour la traduction automatique. Évaluer la capacité de telles stratégies à prendre réellement en compte le contexte et à améliorer ainsi la qualité de la traduction est également un problème délicat, les métriques d'évaluation usuelles étant pour cela inadaptées voire trompeuses. Dans cette thèse, nous proposons plusieurs stratégies pour intégrer le contexte, tant linguistique qu'extra-linguistique, dans le processus de traduction. Nos expériences s'appuient sur des méthodes d'évaluation et des jeux de données que nous avons développés spécifiquement à cette fin. 
Nous explorons différents types de stratégies: les stratégies par pré-traitement, où l'on utilise le contexte pour désambiguïser les données fournies en entrée aux modèles ; les stratégies par post-traitement, où l'on utilise le contexte pour modifier la sortie d'un modèle non-contextuel, et les stratégies où l'on exploite le contexte pendant la traduction proprement dite. Nous nous penchons sur de multiples phénomènes contextuels, et notamment sur la traduction des pronoms anaphoriques, la désambiguïsation lexicale, la cohésion lexicale et l'adaptation à des informations extra-linguistiques telles que l'âge ou le genre du locuteur. Nos expériences, qui relèvent pour certaines de la TA statistique et pour d'autres de la TA neuronale, concernent principalement la traduction de l'anglais vers le français, avec un intérêt particulier pour la traduction de dialogues spontanés. / While huge progress has been made in machine translation (MT) in recent years, the majority of MT systems still rely on the assumption that sentences can be translated in isolation. The result is that these MT models only have access to context within the current sentence; context from other sentences in the same text and information relevant to the scenario in which they are produced remain out of reach. The aim of contextual MT is to overcome this limitation by providing ways of integrating extra-sentential context into the translation process. Context, concerning the other sentences in the text (linguistic context) and the scenario in which the text is produced (extra-linguistic context), is important for a variety of cases, such as discourse-level and other referential phenomena. Successfully taking context into account in translation is challenging. Evaluating such strategies on their capacity to exploit context is also a challenge, standard evaluation metrics being inadequate and even misleading when it comes to assessing such improvement in contextual MT. 
In this thesis, we propose a range of strategies to integrate both extra-linguistic and linguistic context into the translation process. We accompany our experiments with specifically designed evaluation methods, including new test sets and corpora. Our contextual strategies include pre-processing strategies designed to disambiguate the data on which MT models are trained, post-processing strategies to integrate context by post-editing MT outputs and strategies in which context is exploited during translation proper. We cover a range of different context-dependent phenomena, including anaphoric pronoun translation, lexical disambiguation, lexical cohesion and adaptation to properties of the scenario such as speaker gender and age. Our experiments for both phrase-based statistical MT and neural MT are applied in particular to the translation of English to French and focus specifically on the translation of informal written dialogues. Traduction automatique Apprentissage automatique Dialogue Évaluation Contexte Discours Machine translation Machine learning Dialogue Evaluation Context Discourse
112	Möglichkeiten und Grenzen der Maschinellen Übersetzung: Eine Evaluierung der Software Personal Translator für das Sprachenpaar Französisch - Deutsch / Possibilities and Limits of Machine Translation: An Evaluation of the Personal Translator Software for the French-German Language Pair Winter, Franziska 04 August 2014 Not specified info:eu-repo/classification/ddc/400 ddc:400 translation studies, machine translation
113	Strojový překlad do mnoha jazyků současně / Multi-Target Machine Translation Ihnatchenko, Bohdan January 2020 In international and highly multilingual environments, it often happens that a talk, a document, or any other input needs to be translated into a large number of other languages. However, it is not always an option to have a distinct system for each possible language pair, because training and operating such translation systems is computationally demanding. Combining multiple target languages into one translation model usually causes a decrease in output quality for each of its translation directions. In this thesis, we experiment with combinations of target languages to see if a specific grouping of them can lead to better results than just randomly selecting target languages. We build upon recent research on training a multilingual Transformer model without any change to its architecture: adding a target language tag to the source sentence. We trained a large number of bilingual and multilingual Transformer models and evaluated them on multiple test sets from different domains. We found that in most cases grouping related target languages into one model led to better performance compared to models with randomly selected languages. However, we also found that the domain of the test set, as well as the domains of data sampled into the training set, usually have a more...
114	Analýza principů překladových technik využívaných online překladači a jejich porovnání s překladem klasickým / The analysis of the techniques used by online translators in comparison with the traditional form of translation Herejk, Martin January 2019 The central aim of this thesis is to provide a comprehensive and detailed comparison of contemporary machine translation and traditional translation performed by a person. The theoretical part contains two segments which are essential to establishing the background for the thesis. Firstly, a brief historical context is presented to illustrate how and why the concept of machine translation came to exist. Its relevance, utilization and basic technological principles are presented within the confines of the technology currently available. Secondly, an overview of specific language elements which commonly present an obstacle for translators, whether human or artificial, is given and briefly commented upon. The practical part contains a selected body of text which is then translated, firstly by using a specific software tool and secondly by the author of the thesis. These translations are compared with regard to their accuracy, with references made to the mechanics of machine translation and the source code employed to perform the translation itself, effectively combining the linguistic point of view with the technological aspect of the algorithm applied in the translation software. The final part contains a conclusion and summary of obtained results and...
115	Automatic Post-editing and Quality Estimation in Machine Translation of Product Descriptions Kukk, Kätriin January 2022 As a result of drastically improved machine translation quality in recent years, machine translation followed by manual post-editing is currently a trend in the language industry that is slowly but surely replacing manual translation from scratch. In this thesis, the applicability of machine translation to product descriptions of clothing items is studied. The focus lies on determining whether automatic post-editing is a viable approach for improving baseline translations when new training data becomes available and finding out if there is an existing quality estimation system that could reliably assign quality scores to machine translated texts. It is shown that machine translation is a promising approach for the target domain with the majority of systems experimented with being able to generate translations that on average are of almost publishable quality according to the human evaluation carried out, meaning that only light post-editing is needed before the translations can be published. Automatic post-editing is shown to be able to improve the worst baseline translations but struggles with improving the overall translation quality due to its tendency to overcorrect good translations. Nevertheless, one of the trained post-editing systems is still rated higher than the baseline by human evaluators. A new finding is that training a post-editing model on more data using worse translations leads to better performance compared to training on less but higher-quality data. None of the quality estimation systems experimented with shows a strong correlation with human evaluation results which is why it is suggested not to provide the confidence scores of the baseline model to the human evaluators responsible for correcting and approving translations. 
The main contributions of this work are showing that the target domain of product descriptions is suitable for integrating machine translation into the translation workflow, proposing an approach for that translation workflow that is more automated than the current one as well as the finding that it is better to use more data and poorer translations compared to less data and higher-quality translations when training an automatic post-editing system. machine translation automatic post-editing APE quality estimation QE
116	A Study on Manual and Automatic Evaluation Procedures and Production of Automatic Post-editing Rules for Persian Machine Translation Mostofian, Nasrin January 2017 Evaluation of machine translation is an important step towards improving MT. One way to evaluate the output of MT is to focus on different types of errors occurring in the translation hypotheses, and to think of possible solutions to fix those errors. An error categorization is a rather beneficial tool that makes it easy to analyze the translation errors and can also be utilized to manually generate post-editing rules to be applied automatically to the product of machine translation. In this work, we define a categorization for the errors occurring in Swedish-Persian machine translation by analyzing the errors that occur in three data-sets from two websites: 1177.se and Linköping municipality. We define three types of monolingual reference free evaluation (MRF), and use two automatic metrics, BLEU and TER, to conduct a bilingual evaluation for Swedish-Persian translation. Later on, based on the experience of working with the errors that occur in the corpora, we manually generate automatic post-editing (APE) rules and apply them to the product of machine translation. Three different sets of results are obtained: (1) The results of analyzing MT errors show that the three most common types of errors that occur in the translation hypotheses are mistranslated words, wrong word order, and extra prepositions. These types of errors are placed in semantic and syntactic categories respectively. (2) The results of comparing the correlation between the automatic and manual evaluation show a low correlation between the two evaluations. (3) Lastly, applying the APE rules to the product of machine translation gives an increase in BLEU score on the largest data-set while remaining almost unchanged on the other two data-sets. 
The results for TER show a better score on one data-set, while the scores on the two other data-sets remain unchanged. machine translation Persian automatic post-editing General Language Studies and Linguistics
117	Převod prózy do poezie pomocí neuronových sítí / Converting prose into poetry using neural networks Gokirmak, Memduh January 2021 Title: Converting Prose into Poetry with Neural Networks Author: Memduh Gokirmak Institute: Institute of Formal and Applied Linguistics Supervisor: Martin Popel, Institute of Formal and Applied Linguistics Abstract: We present here our attempts to create a system that generates poetry based on a sequence of text provided to it by a user. We explore the use of machine translation and language model technologies based on the neural network architecture. We use different types of data across three languages in our research, and employ and develop metrics to track the quality of the output of the systems we develop. We find that combining machine translation techniques to generate training data to this end with fine-tuning of pre-trained language models provides the most satisfactory generated poetry. Keywords: poetry machine translation language models
118	Kategorizace úprav strojového překladu při post-editaci: jazyková kombinace angličtina - čeština / Classification of Edit Categories in Machine Translation Post-Editing: English-Czech Language Combination Kopecká, Klára January 2021 Today, a translation job does not only mean transferring content from one language to another by utilizing one's own knowledge of two languages and subject matter expertise. Today, translation often means working with suggestions from various resources, including machine translation. The popularity of machine translation post-editing (MTPE) as a form of translation is growing. That is why this skill should be acquired by translation students prior to entering the market. In order to work with machine translation efficiently, not only knowledge of the basic principles of how machine translation engines and translation technology work is needed, but also the ability to assess the relevance of each linguistic edit with regard to the assigned instructions and purpose of the translated text. The aim of this master's thesis is to analyze linguistic edits carried out during an MTPE job from English to Czech. Identified edits are then classified, resulting in a list of linguistic edit categories in English-to-Czech MTPE. KEY WORDS MTPE, PEMT, post-editing, machine translation, classification, edits
119	Integrated Parallel Data Extraction from Comparable Corpora for Statistical Machine Translation / 統計的機械翻訳におけるコンパラブルコーパスからの対訳データの統合的抽出 Chu, Chenhui 23 March 2015 京都大学 / 0048 / 新制・課程博士 / 博士(情報学) / 甲第19107号 / 情博第553号 / 新制||情||98(附属図書館) / 32058 / 京都大学大学院情報学研究科知能情報学専攻 / (主査)教授黒橋禎夫, 教授石田亨, 教授河原達也 / 学位規則第4条第1項該当 / Doctor of Informatics / Kyoto University / DFAM Statistical Machine Translation Comparable Corpora Bilingual Lexicon Extraction Parallel Sentence Extraction Parallel Fragment Extraction 007
120	Entity-Centric Discourse Analysis and Its Applications / エンティティに注目した談話解析とその応用 Wang, Xun 24 November 2017 京都大学 / 0048 / 新制・課程博士 / 博士(情報学) / 甲第20777号 / 情博第657号 / 新制||情報||113(附属図書館) / 京都大学大学院情報学研究科知能情報学専攻 / (主査)教授黒橋禎夫, 教授河原達也, 教授石田亨 / 学位規則第4条第1項該当 / Doctor of Informatics / Kyoto University / DFAM Discourse Analysis Entity Deep Structure Text Representation Transfer-Based Machine Translation 007
