Spelling suggestions: "subject:"1translation memory"" "subject:"atranslation memory""
11 |
Managing Terminology for Translation Using Translation Environment Tools: Towards a Definition of Best PracticesGómez Palou Allard, Marta 03 May 2012 (has links)
Translation Environment Tools (TEnTs) became popular in the early 1990s as a partial solution for coping with ever-increasing translation demands and the decreasing number of translators available. TEnTs allow the creation of repositories of legacy translations (translation memories) and terminology (integrated termbases) used to identify repetition in new source texts and provide alternate translations, thereby reducing the need to translate the same information twice. While awareness of the important role of terminology in translation and documentation management has been on the rise, little research is available on best practices for building and using integrated termbases. The present research is a first step toward filling this gap and provides a set of guidelines on how best to optimize the design and use of integrated termbases. Based on existing translation technology and terminology management literature, as well as our own experience, we propose that traditional terminology and terminography principles designed for stand-alone termbases should be adapted when an integrated termbase is created in order to take into account its unique characteristics: active term recognition, d one-click insertion of equivalents into the target text and document pretranslation. The proposed modifications to traditional principles cover a wide range of issues, including using record structures with fewer fields, adopting the TBX-Basic’s record structure, classifying records by project or client, creating records based on equivalent pairs rather concepts in cases where synonyms exist, recording non-term units and multiple forms of a unit, and using translated documents as sources. The overarching hypothesis and its associated concrete strategies were evaluated first against a survey of current practices in terminology management within TEnTs and later through a second survey that tested user acceptance of the strategies. The result is a set of guidelines that describe best practices relating to design, content selection and information recording within integrated termbases that will be used for translation purposes. These guidelines will serve as a point of reference for new users of TEnTs, as an academic resource for translation technology educators, as a map of challenges in terminology management within TEnTs that translation software developers seek to resolve and, finally, as a springboard for further research on the optimization of integrated termbases for translation.
|
12 |
Sistemas de memórias de tradução e tecnologias de tradução automática: possíveis efeitos na produção de tradutores em formação / Translation memory systems and machine translation: possible effects on the production of translation traineesTalhaferro, Lara Cristina Santos 26 February 2018 (has links)
Submitted by Lara Cristina Santos Talhaferro null (lara.talhaferro@hotmail.com) on 2018-03-07T01:06:11Z
No. of bitstreams: 1
Dissertação_LaraCSTalhaferro_2018.pdf: 4550332 bytes, checksum: 634c0356d3f9c55e334ef6a26a877056 (MD5) / Approved for entry into archive by Elza Mitiko Sato null (elzasato@ibilce.unesp.br) on 2018-03-07T15:46:44Z (GMT) No. of bitstreams: 1
talhaferro_lcs_me_sjrp.pdf: 4550332 bytes, checksum: 634c0356d3f9c55e334ef6a26a877056 (MD5) / Made available in DSpace on 2018-03-07T15:46:44Z (GMT). No. of bitstreams: 1
talhaferro_lcs_me_sjrp.pdf: 4550332 bytes, checksum: 634c0356d3f9c55e334ef6a26a877056 (MD5)
Previous issue date: 2018-02-26 / Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP) / O processo da globalização, que tem promovido crescente circulação de informações multilíngues em escala mundial, tem proporcionado notáveis mudanças no mercado da tradução. No contexto globalizado, para manterem-se competitivos e atenderem à demanda de trabalho, a qual conta com frequentes atualizações de conteúdo e prazos reduzidos, os tradutores passaram a adotar ferramentas de tradução assistidas por computador em sua rotina de trabalho. Duas dessas ferramentas, utilizadas principalmente por tradutores das áreas técnica, científica e comercial, são os sistemas de memórias de tradução e as tecnologias de tradução automática. O emprego de tais recursos pode ter influências imprevisíveis nas traduções, sobre as quais os tradutores raramente têm oportunidade de ponderar. Se os profissionais são iniciantes ou se lhes falta experiência em determinada ferramenta, essa influência pode ser ainda maior. Considerando que os profissionais novatos tendem a utilizar cada vez mais as ferramentas disponíveis para aumentar sua eficiência, neste trabalho são investigados os possíveis efeitos do uso de sistemas de memórias de tradução e tecnologias de tradução automática, especificamente o sistema Wordfast Anywhere e um de seus tradutores automáticos, o Google Cloud Translate API, nas escolhas de graduandos em Tradução. Foi analisada a aplicação dessas ferramentas na tradução (inglês/português) de quatro abstracts designados a dez alunos do quarto ano do curso de Bacharelado em Letras com Habilitação de Tradutor da Unesp de São José do Rio Preto, divididos em três grupos: os que fizeram o uso do Wordfast Anywhere, os que utilizaram essa ferramenta para realizar a pós-edição da tradução feita pelo Google Cloud Translate API e os que não utilizaram nenhuma dessas ferramentas para traduzir os textos. Tal exame consistiu de uma análise numérica entre as traduções, com a ajuda do software Turnitin e uma análise contrastiva da produção dos alunos, em que foram considerados critérios como tempo de realização da tradução, emprego da terminologia específica, coesão e coerência textual, utilização da norma culta da língua portuguesa e adequação das traduções ao seu fim. As traduções também passaram pelo exame de profissionais das áreas sobre as quais tratam os abstracts, para avaliá-las do ponto de vista de um usuário do material traduzido. Além de realizarem as traduções, os alunos responderam a um questionário, em que esclarecem seus hábitos e suas percepções sobre as ferramentas computacionais de tradução. A análise desses trabalhos indica que a automação não influenciou significativamente na produção das traduções, confirmando nossa hipótese de que o tradutor tem papel central nas escolhas terminológicas e na adequação do texto traduzido a seu fim. / Globalization has promoted a growing flow of multilingual information worldwide, causing significant changes in translation market. In this scenario, translators have been employing computer-assisted translation tools (CAT Tools) in a proficient way to meet the demand for information translated into different languages in condensed turnarounds. Translation memory systems and machine translation are two of these tools, used especially when translating technical, scientific and commercial texts. This configuration may have inevitable influences in the production of translated texts. Nonetheless, translators seldom have the opportunity to ponder on how their production may be affected by the use of these tools, especially if they are novice in the profession or lack experience with the tools used. Seeking to examine how the work of translators in training may be influenced by translation memory systems and machine translation technologies they employ, this work investigates how a translation memory system, Wordfast Anywhere, and one of its machine translation tools, Google Cloud Translate API, may affect the choices of Translation trainees. To achieve this goal, we present an analysis of English-to-Portuguese translations of four abstracts assigned to ten students of the undergraduate Program in Languages with Major in Translation at São Paulo State University, divided into three groups: one aided by Wordfast Anywhere, one aided by Google Cloud Translate API, and one unassisted by any of these tools. This study consists of a numerical analysis, assisted by Turnitin, and a comparative analysis, whose aspects examined are the following: time spent to perform the translation, use of specific terminology, cohesion and coherence, use of standard Portuguese, and suitability for their purposes. Apart from this analysis, a group of four experts were consulted on the translations as users of their content. Finally, the students filled a questionnaire on their habits and perceptions on CAT Tools. The examination of their work suggests that automation did not influence the production of the translations significantly, confirming our hypothesis that human translators are at the core of decision-making when it comes to terminological choices and suitability of translated texts to their purpose. / 2016/07907-0
|
13 |
Automatisk kvalitetskontroll av terminologi i översättningar / Automatic quality checking of terminology in translationsEdholm, Lars January 2007 (has links)
Kvalitet hos översättningar är beroende av korrekt användning av specialiserade termer, som kan göra översättningen lättare att förstå och samtidigt minska tidsåtgång och kostnader för översättningen (Lommel, 2007). Att terminologi används konsekvent är viktigt, och något som bör granskas vid en kvalitetskontroll av exempelvis översatt dokumentation (Esselink, 2000). Det finns idag funktioner för automatisk kontroll av terminologi i flera kommersiella program. Denna studie syftar till att utvärdera sådana funktioner, då ingen tidigare större studie av detta har påträffats. För att få en inblick i hur kvalitetskontroll sker i praktiken genomfördes först två kvalitativa intervjuer med personer involverade i detta på en översättningsbyrå. Resultaten jämfördes med aktuella teorier inom området och visade på stor överensstämmelse med vad exempelvis Bass (2006) förespråkar. Utvärderingarna inleddes med en granskning av täckningsgrad hos en verklig termdatabas jämfört med subjektivt markerade termer i en testkorpus baserad på ett autentiskt översättningsminne. Granskningen visade dock på relativt låg täckningsgrad. För att öka täckningsgraden modifierades termdatabasen, bland annat utökades den med längre termer ur testkorpusen. Därefter kördes fyra olika programs funktion för kontroll av terminologi i testkorpusen jämfört med den modifierade termdatabasen. Slutligen modifierades även testkorpusen, där ett antal fel placerades ut för att få en mer idealiserad utvärdering. Resultaten i form av larm för potentiella fel kategoriserades och bedömdes som riktiga eller falska larm. Detta utgjorde basen för mått på kontrollernas precision och i den sista utvärderingen även deras recall. Utvärderingarna visade bland annat att det för terminologi i översättningar på engelska - svenska var mest fördelaktigt att matcha termdatabasens termer som delar av ord i översättningens käll- och målsegment. På så sätt kan termer med olika böjningsformer fångas utan stöd för språkspecifik morfologi. En orsak till många problem vid matchningen var utseendet på termdatabasens poster, som var mer anpassat för mänskliga översättare än för maskinell läsning. Utifrån intervjumaterialet och utvärderingarnas resultat formulerades rekommendationer kring införandet av verktyg för automatisk kontroll av terminologi. På grund av osäkerhetsfaktorer i den automatiska kontrollen motiveras en manuell genomgång av dess resultat. Genom att köra kontrollen på stickprov som redan granskats manuellt ur andra aspekter, kan troligen en lämplig omfattning av resultat att gå igenom manuellt erhållas. Termdatabasens kvalitet är avgörande för dess täckningsgrad för översättningar, och i förlängningen också för nyttan med att använda den för automatisk kontroll. / Quality in translations depends on the correct use of specialized terms, which can make the translation easier to understand as well as reduce the required time and costs for the translation (Lommel, 2007). Consistent use of terminology is important, and should be taken into account during quality checks of for example translated documentation (Esselink, 2000). Today, several commercial programs have functions for automatic quality checking of terminology. The aim of this study is to evaluate such functions since no earlier major study of this has been found. To get some insight into quality checking in practice, two qualitative interviews were initially carried out with individuals involved in this at a translation agency. The results were compared to current theories in the subject field and revealed a general agreement with for example the recommendations of Bass (2006). The evaluations started with an examination of the recall for a genuine terminology database compared to subjectively marked terms in a test corpus based on an authentic translation memory. The examination however revealed a relatively low recall. To increase the recall the terminology database was modified, it was for example extended with longer terms from the test corpus. After that, the function for checking terminology in four different commercial programs was run on the test corpus using the modified terminology database. Finally, the test corpus was also modified, by planting out a number of errors to produce a more idealized evaluation. The results from the programs, in the form of alarms for potential errors, were categorized and judged as true or false alarms. This constitutes a base for measures of precision of the checks, and in the last evaluation also of their recall. The evaluations showed that for terminology in translations of English to Swedish, it was advantageous to match terms from the terminology database using partial matching of words in the source and target segments of the translation. In that way, terms with different inflected forms could be matched without support for languagespecific morphology. A cause of many problems in the matching process was the form of the entries in the terminology database, which were more suited for being read by human translators than by a machine. Recommendations regarding the introduction of tools for automatic checking of terminology were formulated, based on the results from the interviews and evaluations. Due to factors of uncertainty in the automatic checking, a manual review of its results is motivated. By running the check on a sample that has already been manually checked in other aspects, a reasonable number of results to manually review can be obtained. The quality of the terminology database is crucial for its recall on translations, and in the long run also for the value of using it for automatic checking.
|
14 |
Semi-Automatic Translation of Medical Terms from English to Swedish : SNOMED CT in Translation / Semiautomatisk översättning av medicinska termer från engelska till svenska : Översättning av SNOMED CTLindgren, Anna January 2011 (has links)
The Swedish National Board of Health and Welfare has been overseeing translations of the international clinical terminology SNOMED CT from English to Swedish. This study was performed to find whether semi-automatic methods of translation could produce a satisfactory translation while requiring fewer resources than manual translation. Using the medical English-Swedish dictionary TermColl translations of select subsets of SNOMED CT were produced by ways of translation memory and statistical translation. The resulting translations were evaluated via BLEU score using translations provided by the Swedish National Board of Health and Welfare as reference before being compared with each other. The results showed a strong advantage for statistical translation over use of a translation memory; however, overall translation results were far from satisfactory. / Den internationella kliniska terminologin SNOMED CT har översatts från engelska till svenska under ansvar av Socialstyrelsen. Den här studien utfördes för att påvisa om semiautomatiska översättningsmetoder skulle kunna utföra tillräckligt bra översättning med färre resurser än manuell översättning. Den engelsk-svenska medicinska ordlistan TermColl användes som bas för översättning av delmängder av SNOMED CT via översättningsminne och genom statistisk översättning. Med Socialstyrelsens översättningar som referens poängsattes the semiautomatiska översättningarna via BLEU. Resultaten visade att statistisk översättning gav ett betydligt bättre resultat än översättning med översättningsminne, men över lag var resultaten alltför dåliga för att semiautomatisk översättning skulle kunna rekommenderas i detta fall.
|
15 |
Managing Terminology for Translation Using Translation Environment Tools: Towards a Definition of Best PracticesGómez Palou Allard, Marta January 2012 (has links)
Translation Environment Tools (TEnTs) became popular in the early 1990s as a partial solution for coping with ever-increasing translation demands and the decreasing number of translators available. TEnTs allow the creation of repositories of legacy translations (translation memories) and terminology (integrated termbases) used to identify repetition in new source texts and provide alternate translations, thereby reducing the need to translate the same information twice. While awareness of the important role of terminology in translation and documentation management has been on the rise, little research is available on best practices for building and using integrated termbases. The present research is a first step toward filling this gap and provides a set of guidelines on how best to optimize the design and use of integrated termbases. Based on existing translation technology and terminology management literature, as well as our own experience, we propose that traditional terminology and terminography principles designed for stand-alone termbases should be adapted when an integrated termbase is created in order to take into account its unique characteristics: active term recognition, d one-click insertion of equivalents into the target text and document pretranslation. The proposed modifications to traditional principles cover a wide range of issues, including using record structures with fewer fields, adopting the TBX-Basic’s record structure, classifying records by project or client, creating records based on equivalent pairs rather concepts in cases where synonyms exist, recording non-term units and multiple forms of a unit, and using translated documents as sources. The overarching hypothesis and its associated concrete strategies were evaluated first against a survey of current practices in terminology management within TEnTs and later through a second survey that tested user acceptance of the strategies. The result is a set of guidelines that describe best practices relating to design, content selection and information recording within integrated termbases that will be used for translation purposes. These guidelines will serve as a point of reference for new users of TEnTs, as an academic resource for translation technology educators, as a map of challenges in terminology management within TEnTs that translation software developers seek to resolve and, finally, as a springboard for further research on the optimization of integrated termbases for translation.
|
16 |
Computer-Assisted Translation: An Empirical Investigation of Cognitive EffortMellinger, Christopher Davey 28 April 2014 (has links)
No description available.
|
17 |
Vybrané syntaktické jevy v překladech z němčiny do češtiny realizovaných pomocí nástrojů CAT / Some Syntactic Phenomena Related to Computer-Aided Translation from German into CzechJurenka, Jakub January 2014 (has links)
The thesis looks at computer-aided translation (CAT), which, along with machine translation (MT), stands for the use of information technology in translation, but, unlike MT, is not of primary concern for Translation Studies research. Based on the current state of research, this thesis aims to address the syntax of CAT text output. The theoretical part first describes the working principles of CAT tools and presents them in the context of general translation studies, then defines information structure and endophoric reference as the syntactic phenomena to be analysed. Finally it provides a methodological model for the empirical part. Using a set of texts translated with CAT tools and a set of texts translated without them, the empirical part seeks to check whether translators using CAT tools tend to shift from the syntagmatic approach to text to a more paradigmatic one. To this end, an analysis is carried out to determine the interference rate in the information structure of the target texts as well as the rate of subjectively motivated shifts in distant endophoric reference. The final part of the thesis confronts the analysis results and the data reflecting the process of producing the collected texts in order to take into account other potential factors than the use of CAT tools. With regard to...
|
18 |
Outils et environnements pour l'amélioration incrémentale, la post-édition contributive et l'évaluation continue de systèmes de TA. Application à la TA français-chinois. / Tools and environments for incremental improvement, contributive post-editing and continuous evaluation of MT systems. Application to French-Chinese MT.Wang, Lingxiao 14 December 2015 (has links)
La thèse, effectuée dans le cadre d'une bourse CIFRE, et prolongeant un des aspects du projet ANR Traouiero, aborde d'abord la production, l'extension et l'amélioration de corpus multilingues par traduction automatique (TA) et post-édition contributive (PE). Des améliorations fonctionnelles et techniques ont été apportées aux logiciels SECTra et iMAG, et on a progressé vers une définition générique de la structure d'un corpus multilingue, multi-annoté et multimédia, pouvant contenir des documents classiques aussi bien que des pseudo-documents et des méta-segments. Cette partie a été validée par la création de bons corpus bilingues français-chinois, l'un d'eux résultant de la toute première application à la traduction littéraire.Une seconde partie, initialement motivée par un besoin industriel, a consisté à construire des systèmes de TA de type Moses, spécialisés à des sous-langages, en français↔chinois, et à étudier la façon de les améliorer dans le cadre d'un usage en continu avec possibilité de PE. Dans le cadre d'un projet interne sur le site du LIG et d'un projet (TABE-FC) en coopération avec l'université de Xiamen, on a pu démontrer l'intérêt de l'apprentissage incrémental en TA statistique, sous certaines conditions, grâce à une expérience qui s'est étalée sur toute la thèse.La troisième partie est consacrée à des contributions et mises à disposition de supports informatiques et de ressources. Les principales se placent dans le cadre du projet COST MUMIA de l'EU et résultent de l'exploitation de la collection CLEF-2011 de 1,5 M de brevets partiellement multilingues. De grosses mémoires de traductions en ont été extraites (17,5 M segments), 3 systèmes de TA en ont été tirés, et un site Web de support à la RI multilingue sur les brevets a été construit. On décrit aussi la réalisation en cours de JianDan-eval, une plate-forme de construction, déploiement et évaluation de systèmes de TA. / The thesis, conducted as part of a CIFRE grant, and extending one of the aspects of the ANR project Traouiero, first addresses the production, extension and improvement of multilingual corpora by machine translation (MT) and contributory post-editing (PE). Functional and technical improvements have been made to the SECTra and iMAG software produced in previous PhD theses (P.C. Huynh, H.T. Nguyen), and progress has ben made toward a generic definition of the structure of a multilingual, annotated and multi-media corpus that may contain usual documents as well as pseudo-documents (such as Web pages) and meta-segments. This part has been validated by the creation of good French-Chinese bilingual corpora, one of them resulting from the first application to literary translation (a Jules Verne novel).A second part, initially motivated by an industrial need, has consisted in building MT systems of Moses type, specialized to sub-languages, for french↔chinese, and to study how to improve them in the context of a continuous use with the possibility of PE. As part of an internal project on the LIG website and of a project (TABE-FC) in cooperation with Xiamen University, it has been possible to demonstrate the value of incremental learning in statistical MT, under certain conditions, through an experiment that spread over the whole thesis.The third part of the thesis is devoted to contributing and making available computer tools and resources. The main ones are related to the COST project MUMIA of the EU and result from the exploitation of the CLEF-2011 collection of 1.5 million partially multilingual patents. Large translation memories have been extracted from it (17.5 million segments), 3 MT systems have been produced (de-fr, en-fr, fr-de), and a website of support for multilingual IR on patents has been constructed. One also describes the on-going implementation of JianDan-eval, a platform for building, deploying and evaluating MT systems.
|
19 |
Découpage textuel dans la traduction assistée par les systèmes de mémoire de traduction / Text segmentation in human translation assisted by translation memory systemsPopis, Anna 13 December 2013 (has links)
L’objectif des études théoriques et expérimentales présentées dans ce travail était de cerner à l’aide des critères objectifs fiables un niveau optimum de découpage textuel pour la traduction spécialisée assistée par un système de mémoire de traduction (SMT) pour les langues française et polonaise. Afin de réaliser cet objectif, nous avons élaboré notre propre approche : une nouvelle combinaison des méthodes de recherche et des outils d’analyse proposés surtout dans les travaux de Simard (2003), Langlais et Simard (2001, 2003) et Dragsted (2004) visant l’amélioration de la viabilité des SMT à travers des modifications apportées à la segmentation phrastique considérée comme limitant cette viabilité. A la base des observations de quelques réalisations effectives du processus de découpage textuel dans la traduction spécialisée effectuée par l’homme sans aide informatique à la traduction, nous avons déterminé trois niveaux de segmentation potentiellement applicables dans les SMT tels que phrase, proposition, groupes verbal et nominal. Nous avons ensuite réalisé une analyse comparative des taux de réutilisabilité des MT du système WORDFAST et de l’utilité des traductions proposées par le système pour chacun de ces trois niveaux de découpage textuel sur un corpus de douze textes de spécialité. Cette analyse a permis de constater qu’il n’est pas possible de déterminer un seul niveau de segmentation textuelle dont l’application améliorerait la viabilité des SMT de façon incontestable. Deux niveaux de segmentation textuelle, notamment en phrases et en propositions, permettent en effet d’assurer une viabilité comparable des SMT. / The aim of the theoretical and experimental study presented in this work was to define with objective and reliable criterion an optimal level of textual segmentation for specialized translation from French into Polish assisted by a translation memory system (TMS). In this aim, we created our own approach based on research methods and analysis tools proposed particularly by Simard (2003), Langlais and Simard (2001, 2003) and Dragsted (2004). In order to increase the SMT performances, they proposed to eliminate a sentence segmentation level from SMT which is considered an obstacle to achieve satisfying SMT performances. On the basis of the observations of text segmentation process realized during a specialized translation made by a group of students without any computer aid, we defined three segmentation levels which can be potentially used in SMT such as sentences, clauses and noun and verb phrases. We realized then a comparative study of the influence of each of these levels on the reusability of WORDFAST translation memories and on the utility of translations proposed by the system for a group of twelve specialized texts. This study showed that it is not possible to define a unique text segmentation level which would unquestionably increase the SMT performances. Sentences and clauses are in fact two text segmentation levels which ensure the comparable SMT performances.
|
20 |
Comparaison de systèmes de traduction automatique pour la post édition des alertes météorologique d'Environnement Canadavan Beurden, Louis 08 1900 (has links)
Ce mémoire a pour but de déterminer la stratégie de traduction automatique des alertes
météorologiques produites par Environnement Canada, qui nécessite le moins d’efforts de
postédition de la part des correcteurs du bureau de la traduction. Nous commencerons
par constituer un corpus bilingue d’alertes météorologiques représentatives de la tâche de
traduction. Ensuite, ces données nous serviront à comparer les performances de différentes
approches de traduction automatique, de configurations de mémoires de traduction et de
systèmes hybrides. Nous comparerons les résultats de ces différents modèles avec le système
WATT, développé par le RALI pour Environnement Canada, ainsi qu’avec les systèmes de
l’industrie GoogleTranslate et DeepL. Nous étudierons enfin une approche de postédition
automatique. / The purpose of this paper is to determine the strategy for the automatic translation of
weather warnings produced by Environment Canada, which requires the least post-editing
effort by the proofreaders of the Translation Bureau. We will begin by developing a bilingual
corpus of weather warnings representative of this task. Then, this data will be used to
compare the performance of different approaches of machine translation, translation memory
configurations and hybrid systems. We will compare the results of these models with the
system WATT, the latest system provided by RALI for Environment Canada, as well as
with the industry systems GoogleTranslate and DeepL. Finaly, we will study an automatic
post-edition system.
|
Page generated in 0.0848 seconds