Global ETD Search

251	O traduzir como terceira margem: aspectos da narrativa rosiana Campo Geral na versão traduzida para o italiano por Edoardo Bizzarri Cruz, Aline Matos da [UNESP] 17 August 2012 (has links) (PDF) Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) / Este trabalho pretende analisar três aspectos temáticos da narrativa rosiana “Campo Geral”, a saber, o olhar, enquanto representação da inadequação de Miguilim perante a vida; as estórias, como símbolo de sua sensibilidade poética; e a religiosidade, no que tange ao desenvolvimento de sua fé ao longo do texto. Com o intuito de observarmos como o tradutor Edoardo Bizzarri os reescreveu na língua italiana, nosso objetivo é o de refletir sobre o texto em italiano, partindo das seguintes questões: como se deram as escolhas do tradutor com relação aos vocábulos cujo significado é desconhecido ou diversificado para o leitor italiano? Suas soluções tradutórias seriam mais explicativas ou criativas? Para tanto, buscamos entender o ato de traduzir como sendo uma analogia da terceira margem, ou seja, como uma ação que perpassa tanto a língua de partida (margem de cá), quanto à de chegada (margem de lá), fazendo da tradução uma linguagem própria, que se constrói em meio as diferenças entre as línguas. 
Dessa forma, para embasarmos nossa proposta, partimos de alguns aportes teóricos propostos por Walter Benjamin (2001), no que diz respeito à questão da traduzibilidade de cada texto, e de Antoine Berman (2007), ao enxergar a tradução como sendo experiência nela mesma, ou seja, que nasce da vivência de quem a está conduzindo. Nesse sentido, nossa metodologia configurou-se dentro do tema proposto, levantando as principais características do texto original e analisando-as em concomitância com o texto traduzido, além disso, recorremos ao uso das ferramentas computacionais do programa WordSmith Tools, propostas pela Linguística de Corpus, com as quais pudemos ter acesso às palavras-chave relacionadas aos temas em questão, bem como às linhas de concordâncias e número de recorrência dessas palavras no... / The present study aims to analyze three thematic aspects of ‘Campo Geral’, by Guimarães Rosa: the gaze, as a representation of Miguilim’s inadequacy in the face of life; the stories, as a symbol of his poetic sensitivity; and religiosity, regarding the development of his faith throughout the text. In order to observe how the translator Edoardo Bizzarri rewrote them in Italian, our goal is to reflect on the Italian text, posing the following questions: what guided the translator’s choices when facing words whose meaning is unknown or different for the Italian reader? Are his translation solutions more explanatory or more creative? To that end, we sought to understand the act of translating as an analogy for the third margin, that is, as an action that passes through both the source language (this margin) and the target language (that other margin), making translation a language of its own, built amid the differences between the languages. 
Therefore, we drew on concepts from Walter Benjamin (2001), regarding the translatability of each text, and from Antoine Berman (2007), for whom translation is an experience in itself, born from the lived experience of the person carrying it out. Our methodology was shaped within the proposed topic, identifying the main features of the source text and analyzing them alongside the translated text. In addition, we used the computational tools of the WordSmith Tools program, as proposed by Corpus Linguistics, which gave us access to the keywords related to the topics in question, as well as to concordance lines and the number of occurrences of these words in the text. Knowing that the books by Guimarães Rosa are full of local images and refined linguistic constructions, we observed that Bizzarri’s translation choices seek to maintain... Lingüística de corpus Corpus linguistics
252	A generic and open framework for multiword expressions treatment : from acquisition to applications Ramisch, Carlos Eduardo January 2012 (has links) The treatment of multiword expressions (MWEs), like take off, bus stop and big deal, is a challenge for NLP applications. This kind of linguistic construction is not only arbitrary but also much more frequent than one would initially guess. This thesis investigates the behaviour of MWEs across different languages, domains and construction types, proposing and evaluating an integrated methodological framework for their acquisition. There have been many theoretical proposals to define, characterise and classify MWEs. We adopt a generic definition stating that MWEs are word combinations which must be treated as a unit at some level of linguistic processing. They present a variable degree of institutionalisation, arbitrariness, heterogeneity and limited syntactic and semantic variability. There has been much research on automatic MWE acquisition in recent decades, and the state of the art covers a large number of techniques and languages. Other tasks involving MWEs, namely disambiguation, interpretation, representation and applications, have received less emphasis in the field. The first main contribution of this thesis is the proposal of an original methodological framework for automatic MWE acquisition from monolingual corpora. This framework is generic, language independent, integrated and contains a freely available implementation, the mwetoolkit. It is composed of independent modules which may themselves use multiple techniques to solve a specific sub-task in MWE acquisition. The evaluation of MWE acquisition is modelled using four independent axes. We underline that the evaluation results depend on parameters of the acquisition context, e.g., nature and size of corpora, language and type of MWE, analysis depth, and existing resources. 
The second main contribution of this thesis is the application-oriented evaluation of our methodology proposal in two applications: computer-assisted lexicography and statistical machine translation. For the former, we evaluate the usefulness of automatic MWE acquisition with the mwetoolkit for creating three lexicons: Greek nominal expressions, Portuguese complex predicates and Portuguese sentiment expressions. For the latter, we test several integration strategies in order to improve the treatment given to English phrasal verbs when translated by a standard statistical MT system into Portuguese. Both applications can benefit from automatic MWE acquisition, as the expressions acquired automatically from corpora can both speed up and improve the quality of the results. The promising results of previous and ongoing experiments encourage further investigation about the optimal way to integrate MWE treatment into other applications. Thus, we conclude the thesis with an overview of the past, ongoing and future work. Linguagem natural Linguística computacional Natural language processing Computational linguistics Multiword expressions Lexical acquisition Machine translation Lexicography Corpus linguistics
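As a rough illustration of what "automatic MWE acquisition from monolingual corpora" can involve, the sketch below ranks adjacent word pairs by pointwise mutual information (PMI), a common association measure. It is not the mwetoolkit's implementation or interface: the function name bigram_pmi, the min_count threshold and the toy corpus are assumptions made only for this example.

```python
import math
from collections import Counter


def bigram_pmi(tokens, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information (PMI).

    Hypothetical helper for illustration; not part of the mwetoolkit.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs give unreliable PMI estimates
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    # Highest-scoring pairs are the strongest MWE candidates.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy corpus (an assumption), built around two MWEs named in the abstract.
toy_corpus = (
    "the plane will take off soon . the bus stop is near . "
    "we take off at noon . the bus stop was empty ."
).split()
candidates = bigram_pmi(toy_corpus)
```

On real corpora a filter on part-of-speech patterns and a much higher frequency threshold would normally precede the ranking; both are left out here to keep the sketch short.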
253	Bases teórico-metodológicas para elaboração de um glossário bilíngue (português-inglês) de treinamento de força : subsídios para o tradutor Dornelles, Márcia dos Santos January 2015 (has links) O terminógrafo, ao elaborar um produto terminográfico bilíngue para tradutores, deve preocupar-se não só em repertoriar, nas duas línguas, os termos próprios de uma (sub)área do conhecimento, mas também em apresentá-los inseridos em suas combinatórias típicas, ou seja, associados aos elementos que a eles se combinam em nível sintagmático, de forma recorrente nos textos daquela especialidade. Isso porque o tradutor precisa produzir um texto de chegada adequado ao padrão de linguagem em foco, de forma a espelhar o modus dicendi daquele campo. Assim, seu texto soará natural à comunidade de leitores, evitando-se ruídos na comunicação. Diante da falta de produtos terminográficos bilíngues sobre Treinamento de Força (TF), dirigido a tradutores, esta investigação tem como objetivo central apresentar bases teórico-metodológicas para a elaboração de um glossário português-inglês da terminologia do TF. Esse glossário é aqui apresentado como um protótipo, uma amostra de um todo, destinado a auxiliar especialmente tradutores brasileiros que trabalhem na direção português→inglês, mas que pode ser aproveitado também por pesquisadores e estudantes dessa temática que precisem produzir artigos científicos em inglês. Ele inclui guia do usuário, uma árvore de domínio em português do TF, lista de termos em português e 30 exemplares de fichas terminológicas em formato estendido. Outro objetivo do estudo é oferecer uma descrição do comportamento dos termos em português e inglês, e das unidades fraseológicas especializadas (UFE) eventivas (BEVILACQUA, 2003; 2004) em português no âmbito dos artigos científicos sobre TF. Como referencial teórico, valemo-nos dos princípios da Teoria Comunicativa da Terminologia (TCT) e dos fundamentos e diretrizes da Linguística de Corpus (LC). 
Seguir a TCT (CABRÉ, 1999a; 1999b; 2001a; 2001b; 2003; 2009) implica adotar o termo como objeto central de estudo e concebê-lo, antes de tudo, como uma unidade lexical da língua natural que adquire valor especializado dentro de um contexto especializado, segundo critérios semânticos, discursivos e pragmáticos. Seguir a LC (BIBER, 2012; BERBER SARDINHA, 2004) implica uma visão probabilística da língua, pressupondo que, embora muitos traços linguísticos sejam possíveis teoricamente, não ocorrem com a mesma frequência. Ganham realce no estudo os temas da variação terminológica, da tradução funcional e do artigo científico como gênero especializado. Nosso corpus de estudo é constituído de 70 artigos de periódicos científicos de destaque no âmbito do TF, escritos originalmente em português e inglês. São, portanto, dois subcorpora, um em cada língua, que são comparáveis. Para exploração e análise do corpus, utilizamos o software AntConc (ANTHONY, 2011), especialmente as funcionalidades keyword list, n-grams e concordance. Como material de apoio, utilizamos livros-texto e artigos científicos de referência sobre TF, um glossário particular pré-existente de Educação Física, a Terminologia Anatômica Internacional, o Google Acadêmico, o Wikipédia, entre outros. Também contamos com a colaboração de dois consultores especialistas em TF. A pesquisa contempla, então, uma parte teórica e uma parte aplicada que se inter-relacionam e se inserem na dupla face da Terminologia, visto que há uma descrição de uma linguagem especializada a partir de um dado ponto de vista teórico e o desenho de um produto concreto. 
/ When designing a bilingual terminographic product for translators, a terminographer must be concerned not only with including, in both languages, the specific terms of a (sub)field of knowledge, but also with presenting these terms within their typical phraseological structures, that is, associated with the elements they combine with syntagmatically and recurrently in the texts of that domain. This is because a translator needs to produce a target text appropriate to the language pattern in focus, so as to reflect the modus dicendi of that specialized field. In this way, the text produced will sound natural to the community of readers, thereby avoiding noise in communication. Given the lack of bilingual terminographic products on Strength Training (ST) addressed to translators, the main purpose of this research study is to provide theoretical and methodological foundations for the development of a Portuguese-English glossary of ST terminology. This glossary is presented here as a prototype – a sample of a whole – especially designed to assist Brazilian translators working in the Portuguese to English direction, but it can also be useful for researchers and students of this subject who need to produce scientific papers in English. It includes a user guide, a domain tree of ST in Portuguese, a list of terms in Portuguese, and 30 sample terminology records in extended format. Another objective of the study is to provide a description of the behavior of terms in Portuguese and English, and of eventive specialized phraseological units (BEVILACQUA, 2003; 2004) in Portuguese in scientific articles on ST. As a theoretical framework, we drew on the principles of the Communicative Theory of Terminology (CTT) and on the foundations and guidelines of Corpus Linguistics (CL). 
Following CTT (CABRÉ, 1999a; 1999b; 2001a; 2001b; 2003; 2009) implies adopting the term as the central object of study and conceiving it, first of all, as a lexical unit of natural language that acquires specialized value within a specialized context, according to semantic, discursive and pragmatic criteria. Following CL (BIBER, 2012; BERBER SARDINHA, 2004) implies a probabilistic viewpoint of language, assuming that, although many linguistic features are possible theoretically, they do not occur with the same frequency. The topics of terminological variation, functional approach to translation, and the scientific article as a specialized genre are also highlighted in the study. Our corpus consists of 70 articles from leading scientific journals on ST, originally written in Portuguese and English. They are two comparable subcorpora, one in each language. For the exploration and analysis of the corpus, we used the AntConc software (ANTHONY, 2011), especially the tools keyword list, n-grams and concordance. As support material, we used textbooks and reference scientific papers on ST, a pre-existing personal glossary of Physical Education, the International Anatomical Terminology, Google Scholar, Wikipedia, among others. We also had the collaboration of two expert consultants in ST. Therefore, the research embraces a theoretical part and an applied part that interrelate and fall into the double face of Terminology, since there is a description of a specialized language from a given theoretical point of view and the design of a concrete product. Terminologia Lingüística de corpus Terminografia Glossário Treinamento de força Communicative theory of terminology Corpus linguistics Terminography Bilingual glossary Strength training
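The n-gram and concordance views mentioned in the abstract can be approximated in a few lines. The sketch below is a minimal stand-in, not AntConc itself: the function names, the context width and the sample sentence are assumptions chosen only to show what each view returns.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous sequence of n tokens."""
    return Counter(zip(*(tokens[i:] for i in range(n))))


def concordance(tokens, node, width=2):
    """Return (left context, node, right context) for each hit of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == node.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Invented sample sentence (an assumption), not taken from the study corpus.
sample = (
    "strength training increases muscle mass and "
    "strength training improves bone density"
).split()

bigram_freq = ngram_counts(sample, 2)
kwic_lines = concordance(sample, "training", width=2)
```

Reading the concordance lines side by side is what lets a terminographer spot the recurrent combinations around a term, which is the kind of evidence the glossary records are meant to capture.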
254	Compilação, anotação e análise linguístico-computacional de um corpus de textos literários dos séculos XIX e XX: corpus Coelho Neto / Compilation, annotation and linguistic and computational analysis of corpus Coelho Netto (CCN), a corpus of literary texts of 19th and 20th centuries Francimary Macêdo Martins 06 June 2014 (has links) não há / Esta tese é a compilação, anotação morfossintática e análise linguístico-computacional de um corpus de textos literários dos séc. XIX e XX: o Corpus Coelho Netto (CCN), contendo textos dos romances A Conquista e Turbilhão e contos do livro Sertão. O trabalho está na interface da Linguística de Corpus e da Linguística Computacional (BERBER SARDINHA, 2000, 2003, 2004, 2005, 2009; BERBER SARDINHA; ALMEIDA, 2008; OLIVEIRA, 2009; BIDERMAN, 1998, 2001; ALUÍSIO; ALMEIDA, 2006; SHEPHERD, 2012; MACENERY E WILSON, 2001; LEECH, 2004; ALVES; TAGNIN, 2012; ALENCAR, 2009, 2010a, 2010b, 2011a, 2011b, 2013a, 2013b). O CCN contém 53.080 (cinquenta e três mil e oitenta) tokens (pontuação e palavras). A compilação consiste nas etapas de seleção, coleta de textos e manipulação; nesta são realizadas a limpeza, edição e atualização dos textos (ALUÍSIO; ALMEIDA, 2006), para depois ser submetido à anotação morfossintática e análise linguístico-computacional, com o objetivo de obter dados que comprovem ou não o uso “excessivo” de adjetivos, de verbos e de advérbios em -mente, demonstrando a diversidade lexical nos textos de Coelho Netto, constatando se o que a crítica modernista dizia a respeito do escritor era procedente. 
A anotação morfossintática foi realizada pelo etiquetador automático Aelius, modelo AeliusHunPos, um software livre em Python que utiliza a biblioteca Natural Language Toolkit – NLTK (BIRD; KLEIN; LOPER, 2009), no pré-processamento de textos, na construção de etiquetador morfossintático e na anotação de corpora com auxílio de revisão humana (ALENCAR, 2010a, 2013a, 2013b), e que foi treinado no Corpus Histórico do Português Tycho Brahe (CHPTB). A compilação e anotação do CCN envolve outras ações como a reavaliação da acurácia desse etiquetador em textos literários. Os resultados da pesquisa revelaram que: o AeliusHunpos ao anotar os textos do CCN demonstrou maior acurácia que em outros textos já anotados, de 97,9%; que o modelo AeliusHunPos mostrou um desempenho muito além ao anotar os corpora que com o modelo AeliusMaxEnt; e que, após a seleção e correção manual dos 10% dos corpora anotados e gerados arquivos padrão gold, sugerimos um melhoramento dos aproximados 3% de erros cometidos pelo etiquetador, visando o aumento de sua acurácia. Quanto às análises realizadas com os dados obtidos no CCN constatamos que: a diversidade lexical, especificamente quanto a verbos, adjetivos e advérbios em -mente, declarada como exagerada pela crítica à Coelho Netto não procede, pois seus textos são ricos, mas quando comparados aos textos de Aluísio Azevedo e Camilo Castelo Branco, o Corpus de Comparação, apresentam riqueza vocabular similar ao CCN, como expostos nos resultados. / This thesis is the compilation, morphosyntactic annotation and linguistic and computational analysis of a corpus of literary texts of the 19th and 20th centuries: Corpus Coelho Netto (CCN), containing texts of the novels A Conquista and Turbilhão and short stories of the book Sertão. 
The work lies at the interface of Corpus Linguistics and Computational Linguistics (BERBER SARDINHA, 2000, 2003, 2004, 2005, 2009; BERBER SARDINHA; ALMEIDA, 2008; OLIVEIRA, 2009; BIDERMAN, 1998, 2001; ALUÍSIO; ALMEIDA, 2006; SHEPHERD, 2012; MACENERY AND WILSON, 2001; LEECH, 2004; ALVES; TAGNIN, 2012; ALENCAR, 2009, 2010a, 2010b, 2011a, 2011b, 2013a, 2013b). The CCN contains 53,080 (fifty-three thousand and eighty) tokens (punctuation marks and words). The compilation consists of the steps of selection, collection of texts and handling; in the last of these, the texts are cleaned, edited and updated (ALUÍSIO; ALMEIDA, 2006) before being submitted to morphosyntactic annotation and linguistic-computational analysis, with the goal of obtaining data that confirm or refute the “excessive” use of adjectives, verbs and adverbs ending in -mente, demonstrating the lexical diversity of Coelho Netto's texts and checking whether what the modernist critics said about the writer was justified. The annotation was performed by the automatic tagger Aelius, AeliusHunPos model, free software in Python that uses the Natural Language Toolkit (NLTK) library (BIRD; KLEIN; LOPER, 2009) for pre-processing texts, building morphosyntactic taggers and annotating corpora with the help of human review (ALENCAR, 2010a, 2013a, 2013b), and which was trained on the Tycho Brahe Historical Corpus of Portuguese (CHPTB). The compilation and annotation of the CCN involves other actions, such as re-evaluating the accuracy of this tagger on literary texts. The results of the research indicated that: AeliusHunPos, when annotating the CCN texts, showed higher accuracy (97.9%) than on other previously annotated texts; the AeliusHunPos model performed far better in annotating the corpora than the AeliusMaxEnt model; and, after selection and manual correction of 10% of the annotated corpora and the generation of gold-standard files, we suggest improvements targeting the approximately 3% of errors made by the tagger, in order to increase its accuracy. 
Regarding the analyses performed on the data obtained from the CCN, it was found that the claim by Coelho Netto's critics that his lexical diversity, specifically regarding verbs, adjectives and adverbs ending in -mente, is exaggerated does not hold: his texts are rich, but the texts by Aluísio Azevedo and Camilo Castelo Branco, the Comparison Corpus, show vocabulary richness similar to that of the CCN, as shown in the results. Linguística de Corpus Linguística Computacional Etiquetagem Morfossintática AeliusHunPos Coelho Netto Corpus Linguistics Computational Linguistics Morphosyntactic tagging AeliusHunPos Coelho Netto LINGUISTICA APLICADA
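The accuracy figure reported above comes from comparing automatic tags with a manually corrected gold standard. A minimal sketch of that comparison follows; it does not use Aelius or NLTK, and the function name, the example tokens and the tag labels (D, N, VB, ADV, ADJ) are hypothetical rather than the tagset of the Tycho Brahe corpus.

```python
def tagging_accuracy(gold, predicted):
    """Share of tokens whose predicted tag equals the gold-standard tag.

    Both arguments are lists of (token, tag) pairs in the same order.
    """
    if not gold:
        raise ValueError("gold standard is empty")
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    correct = sum(
        1
        for (_, gold_tag), (_, pred_tag) in zip(gold, predicted)
        if gold_tag == pred_tag
    )
    return correct / len(gold)


# Hypothetical tokens and tags, for illustration only.
gold_sample = [("o", "D"), ("menino", "N"), ("corre", "VB"), ("rapidamente", "ADV")]
auto_sample = [("o", "D"), ("menino", "N"), ("corre", "VB"), ("rapidamente", "ADJ")]

accuracy = tagging_accuracy(gold_sample, auto_sample)  # 3 of 4 tags match
```

On this reading, an accuracy of 97.9% is consistent with the roughly 3% of tagger errors that the abstract proposes to reduce.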
255	Colocações verbais em um corpus de aprendizes brasileiros de inglês / Verbal collocations in a corpus of Brazilian learners of English Danilo Suzuki Murakami 22 March 2016 (has links) Muitas pesquisas reconhecem a importância das colocações para o aprendizado da língua inglesa. Contudo, poucos estudos investigaram o tema na escrita de aprendizes brasileiros de inglês. Esta pesquisa examina o papel das colocações verbais em um subcorpus do EF-Cambridge Open Language Database (EFCAMDAT) composto por redações de aprendizes brasileiros de inglês de nível avançado. A abordagem metodológica adotada neste estudo é baseada em técnicas da Linguística de Corpus. Para essa investigação, foi elaborada uma classificação semiautomática de todos os verbos com o auxílio de um programa de anotação de corpora. Em geral, os resultados mostram que praticamente uma em cada cinco combinações entre um verbo e um substantivo é uma colocação. No entanto, os aprendizes não empregam colocações verbais com sucesso mesmo sendo de nível avançado de aprendizado. As colocações verbais apresentaram desvios em 25% dos casos. O principal tipo de inadequação é o uso de um verbo inapropriado causado pela influência do português. Um pequeno número de estruturas sintáticas também pode ser responsável por desvios colocacionais. Mais pesquisas sobre esse tópico precisam ser conduzidas para a total compreensão dos fatores que determinam a taxa de sucesso. Os achados devem contribuir para a área de aprendizagem de inglês por brasileiros. / There is a growing body of literature that recognizes the importance of collocations in English language learning. However, few studies have investigated the use of collocations in the writing of Brazilian learners of English. This research examines the role of verbal collocations in a subcorpus of the EF-Cambridge Open Language Database (EFCAMDAT). The subcorpus comprises writings by advanced learners of English from Brazil. 
The methodological approach taken in this study is based on Corpus Linguistics. For this investigation, a semi-automatic classification of all verbs was applied with the aid of a computer program for annotation of text. Overall, the results indicate that nearly one out of every five combinations between a verb and a noun is a collocation and that learners are not completely successful in the use of verbal collocations despite their advanced level of learning. The use of verbal collocations was found to be deviant in 25% of the cases. The main type of inadequacy was the use of an inappropriate verb caused by the influence of Portuguese. A small number of syntactic patterns may also have been responsible for collocational deviations. More research on this topic needs to be undertaken before full comprehension of the factors that determine success rate. The findings should make a contribution to the field of English learning by Brazilians. Colocação Corpus de aprendizes Ensino de língua estrangeira Língua inglesa Linguística de Corpus Collocation Corpus Linguistics English studies Language teaching Learner corpus
256	'Dubliners' sob a lupa da lingüística de corpus: uma contribuição para a análise e a avaliação da tradução literária / 'Dubliners' under Corpus Linguistics: a contribution to the evaluation of literary translation Lourdes Bernardes Gonçalves 08 November 2006 (has links) Esta tese procura demonstrar a valiosa contribuição da Lingüística de Corpus na análise do texto literário e na avaliação da tradução literária. O corpus é formado pelos textos de Dubliners (1914), uma coletânea de contos de James Joyce, e duas traduções dessa obra, ambas intituladas Dublinenses, uma de Hamilton Trevisan (1964), a outra de José Roberto O'Shea (1993). Primeiramente é apresentado um panorama da Lingüística de Corpus, especialmente como uma abordagem que apresenta interfaces com os Estudos da Tradução e a Análise Literária. Em seguida é feita uma análise da obra original e, logo após, uma avaliação das traduções. Para constatar a efetiva contribuição da Lingüística de Corpus, a análise do texto original e das traduções foi realizada seguindo duas abordagens diferentes, a não computacional e a computacional. Os dados levantados foram comparados, o que permitiu estabelecer que a Lingüística de Corpus de fato representa uma abordagem que traz significativa contribuição aos processos de análise do texto literário e à avaliação de traduções literárias. Assim, foi proposto um modelo híbrido de avaliação de tradução literária, que combina características da abordagem tradicional e da Lingüística de Corpus. Esse modelo foi testado com quatro contos de Dubliners. / This thesis aims at demonstrating the valuable contribution of Corpus Linguistics in the analysis of literary texts and in the evaluation of literary translation. The selected texts are Dubliners (1914), a collection of short stories by James Joyce, and two translations thereof, both entitled Dublinenses, one by Hamilton Trevisan (1964), and the other by José Roberto O'Shea (1993). 
First, an analysis of the original work is carried out, followed by an evaluation of the translations. In order to verify the effective contribution of Corpus Linguistics, the original text and its translations were analysed using two different approaches, a non-computational and a computational one. The data thus obtained were compared and, as a result, it could be established that Corpus Linguistics does represent an approach which makes a significant contribution to the processes of literary text analysis and the evaluation of literary translations. Therefore, a hybrid model for the evaluation of literary translations was proposed, bringing together characteristics of the traditional approach and of Corpus Linguistics. This model was then tested on four short stories from Dubliners. Análise literária Avaliação de tradução literária Dubliners Lingüística de Corpus Tradução literária Corpus Linguistics Dubliners Literary analysis Literary translation Quality assessment
257	Por que textos de divulgação são mais difíceis para aprendizes de leitura com necessidades específicas do que textos científicos? Um estudo direcionado pelo corpus / Why dissemination texts are more difficult for reading learners with special needs than scientific texts? A study directed by the corpus Marlene Dezidério Andreetto 09 April 2013 (has links) Com a crescente demanda por certificações como pré-requisito para o acesso aos cursos de pós-graduação para profissionais de diferentes áreas, a habilidade de leitura tornou-se um campo de grandes possibilidades para a pesquisa de ensino-aprendizagem para fins específicos. O objetivo desta pesquisa é fazer um levantamento das estruturas léxico-gramaticais que possam impedir a fluência da leitura dos diferentes gêneros relacionados à área médica. A partir daí, oferecer ao professor maiores subsídios na preparação e na organização do conteúdo programático em cursos de leitura instrumental, além de possibilitar a elaboração de atividades que possam auxiliá-lo nessa tarefa. Para tanto, foi compilado um corpus de estudo composto por três subcorpora: um acadêmico (NEJM) e dois de divulgação na área de saúde (Health News e WebMD), a partir de textos disponíveis na internet. A compilação do corpus foi feita de forma manual e automática, esta com a ajuda da ferramenta BootCat. A análise do corpus foi feita de forma semiautomática por meio do AntConc 3.3 w 2012. A metodologia baseada em corpus permitiu que o levantamento de alguns padrões léxico-gramaticais recorrentes fosse feito de uma forma adequada e eficiente, além de ter demonstrado ser possível a elaboração de um corpus individualizado, não muito laborioso, que possa atender às necessidades dos aprendizes de leitura com necessidades específicas. Diferenças significativas foram encontradas na comparação dos três subcorpora. 
Confirmando estudos anteriores, como, por exemplo, o de Biber (1998), a forte presença da voz passiva nos textos acadêmicos demonstrou ser uma característica importante para a organização de um conteúdo programático, bem como a forte presença dos verbos regulares. A ordem dos adjetivos nesse tipo de gênero foi outro tópico levantado. Nos textos de divulgação, a presença de características da linguagem falada, como a omissão do that nas orações substantivas e a sequência to be likely, apresentou um forte indício da dificuldade encontrada pelos aprendizes brasileiros de inglês na leitura desse tipo de gênero textual. / Given the growing demand for English certification in academia, reading has become of the utmost importance for professionals in different areas. Brazilian health professionals are required to obtain a proficiency certificate to be accepted into the postgraduate programme of an important university in São Paulo. With that in mind, the aim of this study is to analyse the lexico-grammatical structures and items that can hinder reading comprehension of texts in the medical area. The results may give teachers more support in preparing and organising their reading syllabi in a course for specific purposes. To accomplish that task, we compiled a corpus composed of three subcorpora: one with research articles from the online version of the New England Journal of Medicine and two with articles also published on the internet (Health News and WebMD). The compilation was both manual and automatic, the latter using BootCaT. The corpus analysis was initially performed with the help of WordSmith Tools 5.0 and continued with AntConc 3.3.w 2012. The use of Corpus Linguistics allowed the researcher to observe more effectively some of the lexico-grammatical patterns that were typical and recurrent in the analysed genres. 
In addition, this study has demonstrated that it is possible to build a specialized corpus without much effort so as to help teachers and students alike. Significant differences were found when the three subcorpora were compared. This study has confirmed the massive presence of the passive voice in academic texts (Biber, 1998). In the information genre, it has revealed a proximity to the spoken variety, which might be the cause of the difficulty encountered by most Brazilian students when reading this kind of article. Conhecimento prévio Gêneros Textuais Leitura instrumental Linguística de corpus Background knowledge Corpus linguistics Genre Reading for specific purposes
258	As vozes de Chico Buarque em inglês: tradução e linguística de corpus / Chico Buarque's voices in English: translation and corpus linguistics Sérgio Marra de Aguiar 20 December 2010 (has links) O notável talento de Chico Buarque em lidar com as palavras, assim como sua participação nas traduções de suas obras literárias para o inglês, são o fio condutor desta pesquisa de doutorado, em que se investigaram as traduções de Estorvo, Benjamim e Budapeste. A Linguística de Corpus foi utilizada como base metodológica para investigação do corpus de estudo, composto pelas referidas obras originais e traduzidas. Utilizando-se o programa computacional WordSmith Tools, de Mike Scott, foi extraída do corpus de estudo, uma lista de palavras-chave que serviu como ponto de partida para uma análise qualitativa das traduções, focando o aspecto da recuperação da criatividade lexical do autor pelos tradutores. Os resultados desta análise levaram a crer que houve, por parte dos tradutores, Peter Bush, Clifford Landers e Alison Entrekin, um empenho significativo para recriar as supostas intenções semânticas e estilísticas do escritor. Tal conclusão foi corroborada por entrevistas que este pesquisador conduziu com o autor, com os três tradutores e com a editora Liz Calder. A Linguística de Corpus, por sua vez, mostrou-se eficaz não só como metodologia para exploração de um corpus literário, mas também como uma abordagem, na medida em que revelou aspectos das traduções que não se havia cogitado investigar. / Chico Buarque's notable talent in dealing with words as well as his participation in the translations of his books into English are the core of this doctoral dissertation, which investigated the translations of Estorvo (Turbulence), Benjamim (Benjamin) and Budapeste (Budapest) into English. Corpus Linguistics was used as a methodological basis for investigating the study corpus, constituted by both original and translated works. 
With the computer program WordSmith Tools, developed by Mike Scott, a list of keywords was extracted from the corpus, which served as a starting point for a qualitative analysis of the translations. The focus was to investigate whether the author's lexical creativity was recovered by his translators. The results suggest that the translators, Peter Bush, Clifford Landers and Alison Entrekin, made a significant effort to recreate Chico Buarque's presumed semantic and stylistic intentions. This conclusion was corroborated by the interviews this researcher conducted with the author, the three translators and the publisher Liz Calder. Corpus Linguistics, in turn, was effective not only as a methodology for exploring a literary corpus, but also as an approach, to the extent that it revealed aspects of the translations that had not previously been considered for investigation.
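The keyword extraction step described above can be illustrated with a minimal sketch. It ranks words by log-likelihood keyness, a statistic commonly used for keyword lists in corpus linguistics. This is not the code of WordSmith Tools, and the function name `keyness`, the `min_freq` cutoff and the toy token lists are assumptions made only for illustration.

```python
import math
from collections import Counter


def keyness(study_tokens, reference_tokens, min_freq=2):
    """Rank words overused in the study corpus by log-likelihood keyness."""
    study = Counter(study_tokens)
    ref = Counter(reference_tokens)
    n_study, n_ref = sum(study.values()), sum(ref.values())
    scores = {}
    for word, a in study.items():
        if a < min_freq:
            continue  # ignore rare words (hypothetical cutoff)
        b = ref.get(word, 0)
        # Expected counts if the word were equally likely in both corpora
        e1 = n_study * (a + b) / (n_study + n_ref)
        e2 = n_ref * (a + b) / (n_study + n_ref)
        ll = 2 * (a * math.log(a / e1) + (b * math.log(b / e2) if b else 0.0))
        # Keep only positive keywords: relatively more frequent in the study corpus
        if a / n_study > b / n_ref:
            scores[word] = ll
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data, invented for the example
study = "budapest budapest budapest the the river".split()
reference = "the the the the river river town town".split()
result = keyness(study, reference)
```

In practice the reference corpus would be much larger than the study corpus, and the resulting list would only be a starting point for the qualitative reading of the translations.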
259	Bases teórico-metodológicas para elaboração de um glossário bilíngue (português-inglês) de treinamento de força : subsídios para o tradutor Dornelles, Márcia dos Santos January 2015 (has links) O terminógrafo, ao elaborar um produto terminográfico bilíngue para tradutores, deve preocupar-se não só em repertoriar, nas duas línguas, os termos próprios de uma (sub)área do conhecimento, mas também em apresentá-los inseridos em suas combinatórias típicas, ou seja, associados aos elementos que a eles se combinam em nível sintagmático, de forma recorrente nos textos daquela especialidade. Isso porque o tradutor precisa produzir um texto de chegada adequado ao padrão de linguagem em foco, de forma a espelhar o modus dicendi daquele campo. Assim, seu texto soará natural à comunidade de leitores, evitando-se ruídos na comunicação. Diante da falta de produtos terminográficos bilíngues sobre Treinamento de Força (TF), dirigido a tradutores, esta investigação tem como objetivo central apresentar bases teórico-metodológicas para a elaboração de um glossário português-inglês da terminologia do TF. Esse glossário é aqui apresentado como um protótipo, uma amostra de um todo, destinado a auxiliar especialmente tradutores brasileiros que trabalhem na direção português→inglês, mas que pode ser aproveitado também por pesquisadores e estudantes dessa temática que precisem produzir artigos científicos em inglês. Ele inclui guia do usuário, uma árvore de domínio em português do TF, lista de termos em português e 30 exemplares de fichas terminológicas em formato estendido. Outro objetivo do estudo é oferecer uma descrição do comportamento dos termos em português e inglês, e das unidades fraseológicas especializadas (UFE) eventivas (BEVILACQUA, 2003; 2004) em português no âmbito dos artigos científicos sobre TF. Como referencial teórico, valemo-nos dos princípios da Teoria Comunicativa da Terminologia (TCT) e dos fundamentos e diretrizes da Linguística de Corpus (LC). 
Seguir a TCT (CABRÉ, 1999a; 1999b; 2001a; 2001b; 2003; 2009) implica adotar o termo como objeto central de estudo e concebê-lo, antes de tudo, como uma unidade lexical da língua natural que adquire valor especializado dentro de um contexto especializado, segundo critérios semânticos, discursivos e pragmáticos. Seguir a LC (BIBER, 2012; BERBER SARDINHA, 2004) implica uma visão probabilística da língua, pressupondo que, embora muitos traços linguísticos sejam possíveis teoricamente, não ocorrem com a mesma frequência. Ganham realce no estudo os temas da variação terminológica, da tradução funcional e do artigo científico como gênero especializado. Nosso corpus de estudo é constituído de 70 artigos de periódicos científicos de destaque no âmbito do TF, escritos originalmente em português e inglês. São, portanto, dois subcorpora, um em cada língua, que são comparáveis. Para exploração e análise do corpus, utilizamos o software AntConc (ANTHONY, 2011), especialmente as funcionalidades keyword list, n-grams e concordance. Como material de apoio, utilizamos livros-texto e artigos científicos de referência sobre TF, um glossário particular pré-existente de Educação Física, a Terminologia Anatômica Internacional, o Google Acadêmico, o Wikipédia, entre outros. Também contamos com a colaboração de dois consultores especialistas em TF. A pesquisa contempla, então, uma parte teórica e uma parte aplicada que se inter-relacionam e se inserem na dupla face da Terminologia, visto que há uma descrição de uma linguagem especializada a partir de um dado ponto de vista teórico e o desenho de um produto concreto. 
/ When designing a bilingual terminographic product for translators, a terminographer must be concerned not only with including, in both languages, the specific terms of a (sub)field of knowledge, but also with presenting these terms within their typical phraseological structures, that is, associated with the elements they combine with syntagmatically and recurrently in the texts of that domain. This is because a translator needs to produce a target text appropriate to the language pattern in focus, so as to reflect the modus dicendi of that specialized field. In this way, the text produced will sound natural to the community of readers, thereby avoiding noise in communication. Given the lack of bilingual terminographic products on Strength Training (ST) addressed to translators, the main purpose of this research study is to provide theoretical and methodological foundations for the development of a Portuguese-English glossary of ST terminology. This glossary is presented here as a prototype – a sample of a whole – especially designed to assist Brazilian translators working in the Portuguese to English direction, but it can also be useful for researchers and students of this subject who need to produce scientific papers in English. It includes a user guide, a domain tree of ST in Portuguese, a list of terms in Portuguese, and 30 sample terminology records in extended format. Another objective of the study is to provide a description of the behavior of terms in Portuguese and English, and of eventive specialized phraseological units (BEVILACQUA, 2003; 2004) in Portuguese, in scientific articles on ST. As a theoretical framework, we draw on the principles of the Communicative Theory of Terminology (CTT) and on the foundations and guidelines of Corpus Linguistics (CL). 
Following CTT (CABRÉ, 1999a; 1999b; 2001a; 2001b; 2003; 2009) implies adopting the term as the central object of study and conceiving it, first of all, as a lexical unit of natural language that acquires specialized value within a specialized context, according to semantic, discursive and pragmatic criteria. Following CL (BIBER, 2012; BERBER SARDINHA, 2004) implies a probabilistic viewpoint of language, assuming that, although many linguistic features are possible theoretically, they do not occur with the same frequency. The topics of terminological variation, functional approach to translation, and the scientific article as a specialized genre are also highlighted in the study. Our corpus consists of 70 articles from leading scientific journals on ST, originally written in Portuguese and English. They are two comparable subcorpora, one in each language. For the exploration and analysis of the corpus, we used the AntConc software (ANTHONY, 2011), especially the tools keyword list, n-grams and concordance. As support material, we used textbooks and reference scientific papers on ST, a pre-existing personal glossary of Physical Education, the International Anatomical Terminology, Google Scholar, Wikipedia, among others. We also had the collaboration of two expert consultants in ST. Therefore, the research embraces a theoretical part and an applied part that interrelate and fall into the double face of Terminology, since there is a description of a specialized language from a given theoretical point of view and the design of a concrete product. Terminologia Lingüística de corpus Terminografia Glossário Treinamento de força Communicative theory of terminology Corpus linguistics Terminography Bilingual glossary Strength training
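Two of the corpus operations named above, n-gram counting and concordancing, can be sketched in a few lines. This is only an illustration of the general techniques, not AntConc's implementation. The function names, the `width` parameter and the sample sentence are assumptions introduced for the example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous sequence of n tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def concordance(tokens, node, width=3):
    """Return keyword-in-context lines as (left context, node, right context)."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Invented sample text, already lowercased and tokenized on whitespace
tokens = ("strength training increases muscle strength and "
          "strength training improves power").split()
bigram_counts = ngrams(tokens, 2)
kwic = concordance(tokens, "training", width=2)
```

Recurrent n-grams such as ("strength", "training") hint at candidate terms, while the concordance lines show the words that typically combine with a term, which is the kind of evidence a terminological record would draw on.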
260	A generic and open framework for multiword expressions treatment : from acquisition to applications Ramisch, Carlos Eduardo January 2012 (has links) The treatment of multiword expressions (MWEs), like take off, bus stop and big deal, is a challenge for NLP applications. This kind of linguistic construction is not only arbitrary but also much more frequent than one would initially guess. This thesis investigates the behaviour of MWEs across different languages, domains and construction types, proposing and evaluating an integrated methodological framework for their acquisition. There have been many theoretical proposals to define, characterise and classify MWEs. We adopt a generic definition stating that MWEs are word combinations which must be treated as a unit at some level of linguistic processing. They present a variable degree of institutionalisation, arbitrariness, heterogeneity and limited syntactic and semantic variability. There has been much research on automatic MWE acquisition in recent decades, and the state of the art covers a large number of techniques and languages. Other tasks involving MWEs, namely disambiguation, interpretation, representation and applications, have received less emphasis in the field. The first main contribution of this thesis is the proposal of an original methodological framework for automatic MWE acquisition from monolingual corpora. This framework is generic, language independent, integrated and contains a freely available implementation, the mwetoolkit. It is composed of independent modules which may themselves use multiple techniques to solve a specific sub-task in MWE acquisition. The evaluation of MWE acquisition is modelled using four independent axes. We underline that the evaluation results depend on parameters of the acquisition context, e.g., nature and size of corpora, language and type of MWE, analysis depth, and existing resources. 
The second main contribution of this thesis is the application-oriented evaluation of our methodology proposal in two applications: computer-assisted lexicography and statistical machine translation. For the former, we evaluate the usefulness of automatic MWE acquisition with the mwetoolkit for creating three lexicons: Greek nominal expressions, Portuguese complex predicates and Portuguese sentiment expressions. For the latter, we test several integration strategies in order to improve the treatment given to English phrasal verbs when translated by a standard statistical MT system into Portuguese. Both applications can benefit from automatic MWE acquisition, as the expressions acquired automatically from corpora can both speed up the work and improve the quality of the results. The promising results of previous and ongoing experiments encourage further investigation into the optimal way to integrate MWE treatment into other applications. Thus, we conclude the thesis with an overview of past, ongoing and future work. Linguagem natural Linguística computacional Natural language processing Computational linguistics Multiword expressions Lexical acquisition Machine translation Lexicography Corpus linguistics
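As a rough illustration of automatic MWE acquisition from a monolingual corpus, the sketch below ranks adjacent word pairs by pointwise mutual information (PMI), one common association measure. It is not the mwetoolkit's code or API. The function name, the `min_count` threshold and the sample text are assumptions made for the example.

```python
import math
from collections import Counter


def rank_bigrams_by_pmi(tokens, min_count=2):
    """Rank adjacent word pairs by PMI, highest first."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scored = []
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # PMI is unreliable for rare pairs (hypothetical cutoff)
        p_pair = count / n_bi
        p_independent = (unigrams[w1] / n_uni) * (unigrams[w2] / n_uni)
        scored.append(((w1, w2), math.log2(p_pair / p_independent)))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Invented sample text, already lowercased and tokenized on whitespace
tokens = ("the bus stop is near the bus stop and the plane "
          "will take off when we take off").split()
candidates = rank_bigrams_by_pmi(tokens)
```

A fuller pipeline would filter candidates by part-of-speech pattern and validate them against existing resources before the list is handed to a lexicographer or an MT system.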
