1 |
Automatic subtitling: A new paradigmKarakanta, Alina 11 November 2022 (has links)
Audiovisual Translation (AVT) is a field where Machine Translation (MT) has long found limited success mainly due to the multimodal nature of the source and the formal requirements of the target text. Subtitling is the predominant AVT type, quickly and easily providing access to the vast amounts of audiovisual content becoming available daily. Automation in subtitling has so far focused on MT systems which translate source language subtitles, already transcribed and timed by humans. With recent developments in speech translation (ST), the time is ripe for extended automation in subtitling, with end-to-end solutions for obtaining target language subtitles directly from the source speech. In this thesis, we address the key steps for accomplishing the new paradigm of automatic subtitling: data, models and evaluation. First, we address the lack of representative data by compiling MuST-Cinema, a speech-to-subtitles corpus. Segmenter models trained on MuST-Cinema accurately split sentences into subtitles, and enable automatic data augmentation techniques. Having representative data at hand, we move to developing direct ST models for three scenarios: offline subtitling, dual subtitling, live subtitling. Lastly, we propose methods for evaluating subtitle-specific aspects, such as metrics for subtitle segmentation, a product- and process-based exploration of the effect of spotting changes in the subtitle post-editing process, and finally, a comprehensive survey on subtitlers' user experience and views on automatic subtitling. Our findings show the potential of speech technologies for extending automation in subtitling to provide multilingual access to information and communication.
|
2 |
Using Machine Learning Methods for Evaluating the Quality of Technical DocumentsLuckert, Michael, Schaefer-Kehnert, Moritz January 2016 (has links)
In the context of an increasingly networked world, the availability of high quality translations is critical for success in the context of the growing international competition. Large international companies as well as medium sized companies are required to provide well translated, high quality technical documentation for their customers not only to be successful in the market but also to meet legal regulations and to avoid lawsuits. Therefore, this thesis focuses on the evaluation of translation quality, specifically concerning technical documentation, and answers two central questions: How can the translation quality of technical documents be evaluated, given the original document is available? How can the translation quality of technical documents be evaluated, given the original document is not available? These questions are answered using state-of-the-art machine learning algorithms and translation evaluation metrics in the context of a knowledge discovery process. The evaluations are done on a sentence level and recombined on a document level by binarily classifying sentences as automated translation and professional translation. The research is based on a database containing 22, 327 sentences and 32 translation evaluation attributes, which are used for optimizations of five different machine learning approaches. An optimization process consisting of 795, 000 evaluations shows a prediction accuracy of up to 72.24% for the binary classification. Based on the developed sentence-based classifi- cation systems, documents are classified using recombination of the affiliated sentences and a framework for rating document quality is introduced. Therefore, the taken approach successfully creates a classification and evaluation system.
|
3 |
Cohesion and Comprehensibility in Polish-English Machine Translated TextsWeiss, Sandra January 2011 (has links)
This paper is a study of Polish-English machine translation, where the impact of various types of errors on cohesion and comprehensibility of the translations was investigated. The following phenomena were in focus: 1. The most common errors produced by current state-of-the-art MT systems for Polish-English MT. 2. The effect of various types of errors on text cohesion. 3. The effect of various types of errors on readers’ understanding of the translation.
|
4 |
Comparing Encoder-Decoder Architectures for Neural Machine Translation: A Challenge Set ApproachDoan, Coraline 19 November 2021 (has links)
Machine translation (MT) as a field of research has known significant advances in recent years, with the increased interest for neural machine translation (NMT). By combining deep learning with translation, researchers have been able to deliver systems that perform better than most, if not all, of their predecessors. While the general consensus regarding NMT is that it renders higher-quality translations that are overall more idiomatic, researchers recognize that NMT systems still struggle to deal with certain classic difficulties, and that their performance may vary depending on their architecture. In this project, we implement a challenge-set based approach to the evaluation of examples of three main NMT architectures: convolutional neural network-based systems (CNN), recurrent neural network-based (RNN) systems, and attention-based systems, trained on the same data set for English to French translation. The challenge set focuses on a selection of lexical and syntactic difficulties (e.g., ambiguities) drawn from literature on human translation, machine translation, and writing for translation, and also includes variations in sentence lengths and structures that are recognized as sources of difficulties even for NMT systems. This set allows us to evaluate performance in multiple areas of difficulty for the systems overall, as well as to evaluate any differences between architectures’ performance. Through our challenge set, we found that our CNN-based system tends to reword sentences, sometimes shifting their meaning, while our RNN-based system seems to perform better when provided with a larger context, and our attention-based system seems to struggle the longer a sentence becomes.
|
5 |
The Multidimensional Quality Metric (MQM) Framework: A New Framework for Translation Quality AssessmentMariana, Valerie Ruth 01 December 2014 (has links) (PDF)
This document is a supplement to the article entitled “The Multidimensional Quality Metric (MQM) Framework: A New Framework for Translation Quality Assessment”, which has been acepted for publication in the upcoming January volume of JoSTrans, the Journal of Specialized Translation. The article is a coauthored project between Dr. Alan K. Melby, Dr. Troy Cox and myself. In this document you will find a preface describing the process of writing the article, an annotated bibliography of sources consulted in my research, a summary of what I learned, and a conclusion that considers the future avenues opened up by this research. Our article examines a new method for assessing the quality of a translation known as the Multidimensional Quality Metric, MQM. In our experiment we set the MQM framework to mirror, as closely as possible, the American Translators Association's (ATA) translator certification exam. To do this we mapped the ATA error categories to corresponding MQM error categories. We acquired a set of 29 student translations and had a group of student raters use the MQM framework to rate these translations. We measured the practicality of the MQM framework by comparing the time required for ratings to the average time required to rate translations in the industry. In addition, we had 2 ATA certified translators rate the anchor translation (a translation that was scored by every rater in order to have a point of comparison). The certified translators' ratings were used to verify that the scores given by the student raters were valid. Reliability was also measured, which found that the student raters were not interchangeable, but that the measurement estimate of reliability was adequate. The article's goal was to determine the extent to which the Multidimensional Quality Metric framework for translation evaluation is viable (practical, reliable and valid) when designed to mirror the ATA certification exam. Overall, the results of the experiment showed that MQM could be a viable way to rate translation quality when operationalized based on the ATA's translator certification exam. This is an important discovery in the field of translation quality, because it shows that MQM could be a viable tool for future researchers. Our experiment suggests that researchers ought to take advantage of the MQM framework because, not only is it free, but any studies completed using the MQM framework would have a common base, making these studies more easily comparable.
|
6 |
Entity-based coherence in statistical machine translation : a modelling and evaluation perspectiveWetzel, Dominikus Emanuel January 2018 (has links)
Natural language documents exhibit coherence and cohesion by means of interrelated structures both within and across sentences. Sentences do not stand in isolation from each other and only a coherent structure makes them understandable and sound natural to humans. In Statistical Machine Translation (SMT) only little research exists on translating a document from a source language into a coherent document in the target language. The dominant paradigm is still one that considers sentences independently from each other. There is both a need for a deeper understanding of how to handle specific discourse phenomena, and for automatic evaluation of how well these phenomena are handled in SMT. In this thesis we explore an approach how to treat sentences as dependent on each other by focussing on the problem of pronoun translation as an instance of a discourse-related non-local phenomenon. We direct our attention to pronoun translation in the form of cross-lingual pronoun prediction (CLPP) and develop a model to tackle this problem. We obtain state-of-the-art results exhibiting the benefit of having access to the antecedent of a pronoun for predicting the right translation of that pronoun. Experiments also showed that features from the target side are more informative than features from the source side, confirming linguistic knowledge that referential pronouns need to agree in gender and number with their target-side antecedent. We show our approach to be applicable across the two language pairs English-French and English-German. The experimental setting for CLPP is artificially restricted, both to enable automatic evaluation and to provide a controlled environment. This is a limitation which does not yet allow us to test the full potential of CLPP systems within a more realistic setting that is closer to a full SMT scenario. We provide an annotation scheme, a tool and a corpus that enable evaluation of pronoun prediction in a more realistic setting. The annotated corpus consists of parallel documents translated by a state-of-the-art neural machine translation (NMT) system, where the appropriate target-side pronouns have been chosen by annotators. With this corpus, we exhibit a weakness of our current CLPP systems in that they are outperformed by a state-of-the-art NMT system in this more realistic context. This corpus provides a basis for future CLPP shared tasks and allows the research community to further understand and test their methods. The lack of appropriate evaluation metrics that explicitly capture non-local phenomena is one of the main reasons why handling non-local phenomena has not yet been widely adopted in SMT. To overcome this obstacle and evaluate the coherence of translated documents, we define a bilingual model of entity-based coherence, inspired by work on monolingual coherence modelling, and frame it as a learning-to-rank problem. We first evaluate this model on a corpus where we artificially introduce coherence errors based on typical errors CLPP systems make. This allows us to assess the quality of the model in a controlled environment with automatically provided gold coherence rankings. Results show that this model can distinguish with high accuracy between a human-authored translation and one with coherence errors, that it can also distinguish between document pairs from two corpora with different degrees of coherence errors, and that the learnt model can be successfully applied when the test set distribution of errors comes from a different one than the one from the training data, showing its generalization potentials. To test our bilingual model of coherence as a discourse-aware SMT evaluation metric, we apply it to more realistic data. We use it to evaluate a state-of-the-art NMT system against post-editing systems with pronouns corrected by our CLPP systems. For verifying our metric, we reuse our annotated parallel corpus and consider the pronoun annotations as proxy for human document-level coherence judgements. Experiments show far lower accuracy in ranking translations according to their entity-based coherence than on the artificial corpus, suggesting that the metric has difficulties generalizing to a more realistic setting. Analysis reveals that the system translations in our test corpus do not differ in their pronoun translations in almost half of the document pairs. To circumvent this data sparsity issue, and to remove the need for parameter learning, we define a score-based SMT evaluation metric which directly uses features from our bilingual coherence model.
|
7 |
The mat sat on the cat : investigating structure in the evaluation of order in machine translationMcCaffery, Martin January 2017 (has links)
We present a multifaceted investigation into the relevance of word order in machine translation. We introduce two tools, DTED and DERP, each using dependency structure to detect differences between the structures of machine-produced translations and human-produced references. DTED applies the principle of Tree Edit Distance to calculate edit operations required to convert one structure into another. Four variants of DTED have been produced, differing in the importance they place on words which match between the two sentences. DERP represents a more detailed procedure, making use of the dependency relations between words when evaluating the disparities between paths connecting matching nodes. In order to empirically evaluate DTED and DERP, and as a standalone contribution, we have produced WOJ-DB, a database of human judgments. Containing scores relating to translation adequacy and more specifically to word order quality, this is intended to support investigations into a wide range of translation phenomena. We report an internal evaluation of the information in WOJ-DB, then use it to evaluate variants of DTED and DERP, both to determine their relative merit and their strength relative to third-party baselines. We present our conclusions about the importance of structure to the tools and their relevance to word order specifically, then propose further related avenues of research suggested or enabled by our work.
|
8 |
A critical investigation of deaf comprehension of signed tv news interpretationWehrmeyer, Jennifer Ella January 2013 (has links)
This study investigates factors hampering comprehension of sign language interpretations rendered on South African TV news bulletins in terms of Deaf viewers’ expectancy norms and corpus analysis of authentic interpretations. The research fills a gap in the emerging discipline of Sign Language Interpreting Studies, specifically with reference to corpus studies. The study presents a new model for translation/interpretation evaluation based on the introduction of Grounded Theory (GT) into a reception-oriented model. The research question is addressed holistically in terms of target audience competencies and expectations, aspects of the physical setting, interpreters’ use of language and interpreting choices. The South African Deaf community are incorporated as experts into the assessment process, thereby empirically grounding the research within the socio-dynamic context of the target audience. Triangulation in data collection and analysis was provided by applying multiple mixed data collection methods, namely questionnaires, interviews, eye-tracking and corpus tools. The primary variables identified by the study are the small picture size and use of dialect. Secondary variables identified include inconsistent or inadequate use of non-manual features, incoherent or non-simultaneous mouthing, careless or incorrect sign execution, too fast signing, loss of visibility against skin or clothing, omission of vital elements of sentence structure, adherence to source language structures, meaningless additions, incorrect referencing, oversimplification and violations of Deaf norms of restructuring, information transfer, gatekeeping and third person interpreting. The identification of these factors allows the construction of a series of testable hypotheses, thereby providing a broad platform for further research. Apart from pioneering corpus-driven sign language interpreting research, the study makes significant contributions to present knowledge of evaluative models, interpreting strategies and norms and systems of transcription and annotation. / Linguistics / Thesis (D. Litt.et Phil. (Linguistics)
|
9 |
A critical investigation of deaf comprehension of signed tv news interpretationWehrmeyer, Jennifer Ella January 2013 (has links)
This study investigates factors hampering comprehension of sign language interpretations rendered on South African TV news bulletins in terms of Deaf viewers’ expectancy norms and corpus analysis of authentic interpretations. The research fills a gap in the emerging discipline of Sign Language Interpreting Studies, specifically with reference to corpus studies. The study presents a new model for translation/interpretation evaluation based on the introduction of Grounded Theory (GT) into a reception-oriented model. The research question is addressed holistically in terms of target audience competencies and expectations, aspects of the physical setting, interpreters’ use of language and interpreting choices. The South African Deaf community are incorporated as experts into the assessment process, thereby empirically grounding the research within the socio-dynamic context of the target audience. Triangulation in data collection and analysis was provided by applying multiple mixed data collection methods, namely questionnaires, interviews, eye-tracking and corpus tools. The primary variables identified by the study are the small picture size and use of dialect. Secondary variables identified include inconsistent or inadequate use of non-manual features, incoherent or non-simultaneous mouthing, careless or incorrect sign execution, too fast signing, loss of visibility against skin or clothing, omission of vital elements of sentence structure, adherence to source language structures, meaningless additions, incorrect referencing, oversimplification and violations of Deaf norms of restructuring, information transfer, gatekeeping and third person interpreting. The identification of these factors allows the construction of a series of testable hypotheses, thereby providing a broad platform for further research. Apart from pioneering corpus-driven sign language interpreting research, the study makes significant contributions to present knowledge of evaluative models, interpreting strategies and norms and systems of transcription and annotation. / Linguistics and Modern Languages / Thesis (D. Litt.et Phil. (Linguistics)
|
Page generated in 0.1489 seconds