Global ETD Search

41	eDictor: da plataforma para a nuvem / eDictor: from platform to the cloud
Luiz Henrique Lima Veronesi, 04 February 2015

In this work, we present a new proposal for editing texts that belong to an electronic corpus. Starting from the development history of the Tycho Brahe corpus and the eDictor tool, we analyze the whole workflow of corpus creation in order to obtain a more concise and less redundant way of organizing information, using a single repository that holds both the textual and the morphosyntactic data. This was achieved by creating a data structure based on minimal units called tokens and blocks of units called chunks. The relationship between tokens and chunks, as considered in this work, can store how the text is structured visually (pages, paragraphs, sentences) and how it is structured syntactically in trees. All files in the Tycho Brahe corpus text catalog served as the basis for the analysis. This analysis led to generic, interrelated elements: the text is deconstructed, and chunks are defined by start and end points relative to the words (tokens) rather than by the linear form of the text. Introducing object orientation made it possible to relate units even smaller than the token, the split tokens, which are themselves tokens because they inherit the characteristics of the most significant element, the token. The aim was to use as few attributes as possible, avoiding attributes that are too specific or too vague. In seeking this balance, one specific attribute proved necessary for the syntactic chunk: a level attribute that indicates the distance from a tree node to the root node.

Once the information is organized, access to it becomes simpler, and the focus turns to defining the user interface. Available web technology allows elements to be positioned on the screen so as to reproduce the way the text appears in a book, and it also allows each element to be independent of the others. This independence is what lets information travel between the user's computer and the processing center in the cloud without the user noticing. Processing occurs in the background, using asynchronous technologies. The resemblance between HTML and XML created a need to adapt the information for presentation to the user. The solution adopted in this work assigns to each token the information indicating which chunk it belongs to. Thus it is not that words belong to a sentence; rather, each word carries a piece of information that makes it belong to the sentence. This subtle shift in perspective changes the way the information is displayed.

Keywords: Annotated corpus, Computational linguistics, Corpus linguistics, Electronic corpus, Philological digital edition, Web architecture
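Illustrative note on entry 41: the token and chunk model described in the abstract above can be sketched in Python. This is a minimal sketch under stated assumptions, not eDictor's actual code. The class names, the fields (index, part, kind, label) and the helper chunks_of are hypothetical and only mirror what the abstract states: tokens as minimal units, split tokens inheriting from tokens, chunks defined by start and end points relative to tokens, and a level attribute reserved for syntactic chunks.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    """Minimal unit of the text (roughly one word)."""
    index: int  # position of the token in the token sequence
    text: str


@dataclass
class SplitToken(Token):
    """A piece of a token; it is still a Token by inheritance."""
    part: int = 0  # order of this piece inside its parent token


@dataclass
class Chunk:
    """A block of tokens defined by start and end token positions."""
    kind: str                    # e.g. "page", "sentence", "syntactic"
    start: int                   # first token index (inclusive)
    end: int                     # last token index (inclusive)
    label: str = ""
    level: Optional[int] = None  # syntactic chunks only: distance to the root


def chunks_of(token: Token, chunks: List[Chunk]) -> List[Chunk]:
    """Return every chunk whose token range covers the given token."""
    return [c for c in chunks if c.start <= token.index <= c.end]


# Hypothetical three-word sentence; "da" is a contraction of "de" + "a".
tokens = [Token(0, "Saiu"), Token(1, "da"), Token(2, "casa")]
pieces = [SplitToken(1, "de", part=0), SplitToken(1, "a", part=1)]
chunks = [
    Chunk("sentence", 0, 2, label="s1"),
    Chunk("syntactic", 0, 2, label="IP", level=0),
    Chunk("syntactic", 1, 2, label="PP", level=1),
]
```

Because each chunk only points at token positions, the visual structure (sentence) and the syntactic structure (IP, PP) can overlap freely over the same tokens, and a lookup such as chunks_of(tokens[2], chunks) recovers everything a word belongs to.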
42	Apport de la linguistique de corpus à la lexicographie bilingue (français-arabe) : macrostructure et microstructure d'un dictionnaire de collocations / The contribution of corpus linguistics to bilingual French-Arabic lexicography : macrostructure and microstructure in collocation dictionaries
Al-Qaisi, Fu'ad, 07 December 2015

The aim of this study is to examine the contribution of corpus linguistics to bilingual French-Arabic lexicography. We focus particularly on collocations, following them from the compilation of the corpus through to their integration into the lexicon. Fundamental notions such as corpus linguistics, the corpus and the collocation are examined. The research then takes an empirical, corpus-based turn. To compensate for the unavailability of corpus processing tools for Arabic, we developed an approach that we call the footbridge strategy. The idea is to start from a French-Arabic parallel (translated) corpus, which consists of the French edition of Le Monde Diplomatique and its Arabic translation. Using a parallel corpus is intended to make contrastive phenomena easier to identify. The results obtained in the Arabic component of the translated corpus are then checked in a comparable monolingual Arabic corpus made up of three newspapers: Alrai, Alayyam and Algouhouria. Throughout this part, results are compared first between corpora and dictionaries, second between corpus types (parallel and comparable), and third between the newspapers of the comparable corpus.

A number of collocations are then subjected to structural and semantic examination. This examination sheds light not only on the collocational environment between language and discourse but also on a possible approach to integrating collocations into the dictionary. Legitimate questions gradually arise about the resemblance between collocations in the two languages. The results highlight phenomena such as collocational chains (clusters) and collocational synonymy, among others. The study culminates in the design of an electronic dictionary of collocations: an active bilingual dictionary aimed at Arabic language specialists and translators.

Keywords: Collocation, Corpus linguistics, Parallel corpus, Comparable corpus, Bilingual lexicography, Lexicology, Idioms
43	DO NOT COVER : Störst av allt är feelingen. Om att frigöra sig från skam genom Corpus/Jewellery.
Hammarberg, Sofia, January 2016

Some things you can't touch, see or grab. But they are there. Always and everywhere. Silently invading every part of you, your everyday life and the things in it. The less you speak of it, the more you have it. The more you have it, the less you want to address it and the less you can. Shame is not logic, it is not the brain reacting, hardly our conscience; it's the body. It is truly and fully a physical feeling. For the first time I give myself permission to dig into all of these materials; I indulge in the styles and tastes that I've long felt were forbidden to me. I won't be limited in my choices of symbols, coloration or aesthetics in the ways that good taste and patriarchal structures have taught me to be. I am letting that guard down and diving in, using it to the advantage of my work and my theme. I feast on fake pearls, glitter, shells and plastics. I turn towards what is considered shameful or bad taste and work with it, embrace it and elevate it. Not to show that it is the new "right", but to make up for all of the times that I have turned away from it because of shame. To be a person with feelings of shame is to be a person who will automatically try to turn away from themselves. Shame is intimately entwined with femininity; it is silently inherited from generation to generation. It is experienced only by some bodies and not others. It is not being able to see your own value. It is the loneliest feeling in the world, but really a marker for something much bigger and deeper than one individual. It is materialized everywhere around us. It is not me, but it is not not me.

Keywords: Corpus, Jewellery, Shame, Femininity
44	Acquisition de schémas prédicatifs verbaux en japonais / Verbal predicate-frame acquisition in Japanese
Marchal, Pierre, 15 October 2015

The acquisition of lexical knowledge about verbal constructions is an important issue for natural language processing as well as for lexicography, which aims to document emerging linguistic usages. The task raises numerous challenges, both technical and theoretical. In this thesis, we look more closely at two fundamental aspects of the description of the verb: the notion of lexical entry and the distinction between arguments and adjuncts. Following earlier studies in natural language processing and linguistics, we adopt the hypothesis that there is no clear-cut distinction between homonyms and quasi-synonyms, and likewise that there is a continuum between arguments and adjuncts. We propose a complete processing pipeline for acquiring verbal predicate frames in Japanese from an untagged corpus of news texts. The pipeline integrates the notion of argumenthood into the creation of lexical entries and models both continuums. The resulting resource underwent a qualitative comparative evaluation, which brought out how difficult it is for existing linguistic resources to describe new data, and thereby argues for a lexicology anchored in the epistemological framework of corpus linguistics.

Keywords: Japanese, Corpus linguistics, Syntax, Verb
45	Proficiência escrita em inglês especializado : estudo de corpus de abstracts em Medicina, Nutrição e Farmácia
Freitas, Ana Luiza Pires de, January 2016

This research explores the development of written proficiency in English in the production of abstracts in the field of Health Sciences. It aims to contribute to the creation of instructional materials, to the training of language educators and to advances in the teaching and learning of English for Academic Purposes. Grounded in Corpus Linguistics, the Linguistics of Specialized Languages and English for Academic Purposes studies, the investigation compiled, described and analyzed a corpus of 180,170 words made up of abstracts in Medicine, Nutrition and Pharmacy. The unit of analysis is the lexical bundle, a recurrent sequence of words used in texts. For bundle extraction and identification, the criteria were a length of 4 orthographic words and a minimum frequency and distribution of 5 occurrences in at least 5 different texts, applied to both the international and the Brazilian parts of the corpus. 96 lexical bundles were extracted from the international subcorpus (90,098 words) and 88 from the Brazilian subcorpus (90,072 words). The metrics of frequency and lexical variability reveal differences between the two parts in how science is narrated. The Brazilian subcorpus showed more repetition of word associations and greater use of lexical bundles to express purpose and to record that the academic work was carried out. The international subcorpus, by contrast, was characterized by more diverse bundles, an objective narrative and the use of bundles to foreground the scientific activity itself. Although these findings are specific to the corpus studied, they reinforce how important it is for language educators and designers of teaching programs to recognize the peculiarities of the contexts in which abstracts are produced, so that pedagogical practice can be attuned to learners' needs. The study concludes with suggestions for applying the findings in teaching activities.

Keywords: Applied linguistics, Corpus linguistics, Corpus, Proficiency, English language, Abstract
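Illustrative note on entry 45: the extraction criterion stated in the abstract above (sequences of 4 words with at least 5 occurrences spread over at least 5 different texts) can be sketched in Python. This is a minimal sketch, not the procedure or software used in the thesis. The function names tokenize and extract_bundles, the regular-expression tokenizer and the sample abstracts are assumptions made for illustration.

```python
import re
from collections import Counter
from typing import Dict, List, Tuple

Bundle = Tuple[str, ...]


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into orthographic words (assumed rule)."""
    return re.findall(r"\w+(?:[-']\w+)*", text.lower())


def extract_bundles(texts: List[str], n: int = 4,
                    min_freq: int = 5, min_texts: int = 5) -> Dict[Bundle, int]:
    """Return n-word sequences meeting both the frequency and the range cut-off."""
    freq: Counter = Counter()    # total occurrences of each sequence
    spread: Counter = Counter()  # number of distinct texts containing it
    for text in texts:
        words = tokenize(text)
        grams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
        freq.update(grams)
        spread.update(set(grams))  # count each text at most once
    return {g: c for g, c in freq.items()
            if c >= min_freq and spread[g] >= min_texts}


# Hypothetical mini-corpus: five near-identical abstracts plus one outlier.
abstracts = ["The aim of this study was to test X."] * 5
abstracts.append("Nothing here matches at all.")
bundles = extract_bundles(abstracts)
```

Keeping the two counters separate matters: a sequence repeated five times inside a single abstract passes the frequency cut-off but fails the distribution cut-off, which is the case the second threshold is meant to exclude.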
46	La restricción constitucional en el proceso de hábeas corpus
Bazán Lora, Aurelio Luis, January 2018

This thesis identifies the general and specific problems raised from the standpoint of constitutional law. It seeks to contribute to an adequate administration of justice that safeguards, at all times and in all circumstances, unrestricted respect for individual liberty, which must prevail at every level of the administration of justice. The research and preparation of this study followed an observational, descriptive and explanatory methodology. The study, analysis and evaluation of the normative restrictions on habeas corpus constitutional proceedings aim at the interpretation and balanced application of the essential content of the fundamental rights enshrined in the 1993 Political Constitution of the State. These rights are therefore, prima facie, the living expression of the real and concrete manifestation of all the common values that society aspires to achieve as a legally organized social body. Accordingly, the rights that safeguard the principles of constitutional guarantee proceedings should not collide with one another, much less impair the essential content of other fundamental-rights principles, such as the right to equality before the law, the right to legitimate defense, the right to multiple instances of review and the right to due process. / Thesis

Keywords: Habeas corpus - Peru, Habeas corpus, Human rights - Peru, Criminal law
47	As that-clauses em abstracts escritos por alunos brasileiros de universidades públicas: uma análise baseada em corpus / The that-clauses in abstracts written by Brazilian students of public universities: a corpus-based analysis
Alves, Anna Luisa Lopes, 10 April 2018
Funding: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

The writing of abstracts has been studied by researchers from different areas (BHATIA, 1993; SWALES and FEAK, 2009; DAYRELL, 2011) in order to identify which patterns are most used by authors in high-impact international journals and so prepare researchers to write in this genre. In this dissertation, we analyze how Corpus Linguistics, a learner corpus (DUTRA et al., 2016) composed of texts by Brazilian students of public universities, and the MICUSP corpus (RÖMER, 2012), containing texts by native and non-native English-speaking students at the University of Michigan, can be used to develop the academic writing of abstracts (BIBER, 2007; SWALES and FEAK, 2009; GLASMAN-DEAL, 2010), and how university students use that-clauses to report the findings, results and conclusions of their work in abstracts. After collecting and compiling our corpora, we used the corpus analysis tool Sketch Engine (KILGARRIFF et al., 2014) to generate concordance lines for the search word "that". The results show that, in abstracts by students of Brazilian universities and in abstracts by University of Michigan students, the structures most used to report findings, results and conclusions were subject + verb in the simple past + that and subject + verb in the simple present + that, respectively, in line with the findings of authors such as Biber et al. (2007) and Glasman-Deal (2010).

The final result showed that Corpus Linguistics can be used in teaching and learning English for Academic Purposes with a positive outcome for the academic writing of abstracts, as verified from the structures students used in their abstracts, and that teaching activities can be built on learner corpora to develop the student's autonomy as a researcher. We also note that Corpus Linguistics is a very broad area that accommodates many types of work and application in the analysis of linguistic data for diverse purposes and, following Berber Sardinha (2004), that using corpora in foreign language teaching supports the teaching and learning of a second language by making learners more autonomous.

Keywords: Corpus linguistics and teaching, EAP, Abstracts, That-clauses
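Illustrative note on entry 47: the concordance lines for the search word "that" mentioned in the abstract above follow the keyword-in-context idea, which can be sketched in a few lines of Python. This is a minimal sketch and does not reproduce Sketch Engine or its API. The function name concordance, the width parameter and the sample sentence are assumptions made for illustration.

```python
import re
from typing import List, Tuple


def concordance(text: str, keyword: str,
                width: int = 4) -> List[Tuple[str, str, str]]:
    """Return (left context, match, right context) for each hit of keyword."""
    words = re.findall(r"\w+(?:[-']\w+)*", text)
    lines = []
    for i, word in enumerate(words):
        if word.lower() == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            lines.append((left, word, right))
    return lines


# Hypothetical sample, not taken from either corpus in the study.
sample = "The results showed that the method works. We conclude that it helps."
kwic = concordance(sample, "that", width=2)
for left, hit, right in kwic:
    print(f"{left:>20} | {hit} | {right}")
```

Aligning the hits in a column makes the material to the left of "that" easy to scan, which is where the subject + verb patterns reported in the abstract would appear ("results showed", "We conclude").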
48	Um estudo sobre a representação da figura feminina nas traduções de The Chronicles of Narnia: The Silver Chair à luz dos Estudos da Tradução Baseados em Corpus / A study on the representation of the female figure in translations of The Chronicles of Narnia: The Silver Chair, in light of Corpus-based translation studies
Morante, Naiara Gomes [UNESP], 27 April 2018

This dissertation studies the lexicon of a translated work using a corpus-based approach. We chose to analyze the book The Chronicles of Narnia: The Silver Chair (1953), by C. S. Lewis, and its translations into Portuguese, As Crônicas de Nárnia: a cadeira de prata, translated by Paulo Mendes Campos, and into Spanish, Las Crónicas de Narnia: la silla de plata, translated by María Rosa Duhart Silva. We selected three words for analysis that relate to the representation of the main female characters and to symbolic aspects of the narrative: Jill, Witch and owl. These are keywords in the corpus we compiled, extracted with specific tools in the WordSmith Tools software. We then investigated the three words according to their co-text (the text around the search word) and their context. To do so, we took into account the data provided by the program, such as the number of occurrences of the chosen words in the source text (ST) and in the target texts (TTs), and the theoretical assumptions of Corpus-based Translation Studies (BAKER, 1993, 1995, 1996). The results of the analyses point to the creation of new meanings in the TTs according to the lexicon selected by the translators, leading readers to conceptualize the characters in different ways on the basis of their impressions of the TT.

Keywords: Translation, Corpus Linguistics, Lexicon, Narnia
49	Proficiência escrita em inglês especializado : estudo de corpus de abstracts em Medicina, Nutrição e Farmácia Freitas, Ana Luiza Pires de January 2016 (has links) Este trabalho explora o desenvolvimento da proficiência escrita em língua inglesa no âmbito da produção de abstracts, no campo das Ciências da Saúde. O objetivo é contribuir para a elaboração de materiais instrucionais, para a formação de educadores linguísticos e para os avanços do campo de ensino e aprendizagem de English for Academic Purposes. A pesquisa reuniu, descreveu e analisou um corpus de 180.170 palavras, com abstracts das áreas de Medicina, Nutrição e Farmácia, com base nos fundamentos da Linguística de Corpus, da Linguística das Linguagens Especializadas e dos Estudos em English for Academic Purposes. A unidade analítica do estudo são os pacotes lexicais (lexical bundles), sequências recorrentes de palavras empregadas nos textos. Para o trabalho de extração e identificaçāo de pacotes lexicais, estabeleceu-se o critério de extensão de 4 palavras gráficas e frequência e distribuição mínimas de 5 ocorrências em, pelo menos, 5 textos diferentes, tanto para o acervo internacional, quanto para o brasileiro. Foram extraídos 96 pacotes lexicais do subcorpus internacional, com 90.098 palavras, e 88 sequências recorrentes do subcorpus brasileiro, com 90.072 palavras. Com base nas métricas de frequência e variabilidade lexical, constatam-se distinções nos modos de narrar a ciência entre as duas partes do acervo. O subcorpus brasileiro apresentou maior repetição de associações de palavras e um maior emprego de lexical bundles para expressar a finalidade e registrar a realização do trabalho acadêmico. O subcorpus internacional, por sua vez, caracterizou-se pela diversidade dos pacotes lexicais, pela objetividade da narrativa e pelo uso de feixes de palavras para destacar o fazer científico propriamente dito. 
Embora os resultados obtidos sejam específicos para o corpus reunido, os achados reforçam a importância de educadores linguísticos e desenhistas de programas de ensino e aprendizagem reconhecerem as peculiaridades dos contextos de produção dos abstracts, para que a prática pedagógica seja sintonizada às necessidades do aprendiz. Na conclusão do estudo, são apresentadas sugestões para aproveitamento dos resultados em atividades de ensino. / This research explores the development of written proficiency in English regarding the production of abstracts in the field of Health Sciences. As such, it aims to contribute to advances in the study of English for Academic Purposes by fostering language teachers’ development and by supporting the creation of instructional materials. Based on Corpus Linguistics, Linguistics for Specialized Languages and English for Academic Purposes, the investigation compiled, described and analyzed a corpus of 180,170 words, composed of abstracts in Medicine, Nutrition and Pharmacy. The analytical units of the study are lexical bundles, recurrent strings of words used in texts. For bundle extraction and identification, the criteria were a length of 4 graphic words and a minimum frequency and distribution of 5 occurrences in at least 5 different texts, in each of the two parts of the corpus. A total of 96 lexical bundles were extracted from the international subcorpus, which adds up to 90,098 words, whilst 88 recurrent word sequences were obtained from the Brazilian subcorpus, which amounts to 90,072 words. Based on the metrics of lexical frequency and variability, the two parts of the corpus revealed distinctions in the ways of building a scientific narrative. A greater repetition of word associations and a higher use of lexical bundles to express purpose and to record the completion of the academic work were observed in the Brazilian subcorpus. 
The international subcorpus, on the other hand, features more diverse recurrent strings of words, a concise prose and the use of extended collocations to highlight the scientific enterprise in itself. Although these findings are specific to the corpus studied, they bring out the usefulness of language educators’ and program designers’ awareness of the peculiarities of the different abstract production contexts, so that pedagogical practice can be attuned to learners’ needs. Suggestions for the application of the findings in teaching tasks are provided in the concluding part of the investigation. Linguística aplicada Lingüística de corpus Corpus Proficiência Língua inglesa Resumo
50	Description du ɓaka, une langue oubanguienne du Cameroun / A description of ɓaka, an Ubangian language of Cameroon Djoupee, Bertille 23 November 2017 (has links) Il s’agit d’une description grammaticale du ɓaka, une langue oubanguienne (Niger-Congo). L’analyse se fonde sur un corpus recueilli sur le terrain, dans la région de l’Est-Cameroun (département du Haut-Nyong). Le corpus représente 1h 36mn de paroles spontanées qui ont été traitées sous Toolbox, Elan et Praat puis analysées dans une perspective structurale fonctionnaliste. La thèse comprend trois parties. La première regroupe une introduction et l’analyse phonologique. La seconde partie est consacrée à l’établissement des catégories grammaticales. Le ɓaka étant une langue à faible morphologie, c’est à partir de critères syntaxiques que quinze catégories ont été identifiées : Verbe, Nom, Nom relationnel, Pronom personnel, Pronom, Adjectif, Adverbe, Numéral, Prédicatif, Préposition, Subordinatif, Coordinatif, Interjection, Onomatopée et Modalité. Je présente, pour chaque catégorie définie, une étude des formes et de son fonctionnement. La troisième partie qui porte sur la syntaxe présente le syntagme nominal, le syntagme verbal et la prédication non verbale qui sont les éléments fondamentaux de la structuration de cette langue. La prédication non verbale combine le recours à des prédicatifs non verbaux et à la construction d’énoncé sans prédicatif dédié fondée sur la juxtaposition de deux éléments dont j’analyse les caractéristiques. J’aborde ensuite l’énoncé complexe, et traite en particulier des connecteurs entre propositions que sont les coordinatifs et les subordinatifs, puis des procédés de topicalisation et de focalisation qui manifestent la hiérarchie dans l’énoncé. Une bibliographie et une annexe présentant la transcription de trois textes du corpus terminent ce travail. / Ɓaka is an Ubangian language of the Niger-Congo language family. 
The grammatical description is based on a text corpus that was collected during fieldwork in the department of Haut Nyong in the East Province of Cameroon. The corpus consists of recordings (1h and 36 min) of spontaneous speech, which were annotated in Toolbox, Elan and Praat and then analyzed from a structuralist-functionalist perspective. The thesis is divided into three parts. Part 1 contains the introduction and the phonological analysis. Part 2 is dedicated to defining the word classes. As Ɓaka is a language with little morphology, the following 15 word classes were identified through syntactic criteria: verb, noun, relational noun, personal pronoun, pronoun, adjective, adverb, numeral, predicator, preposition, subordinator, coordinator, interjection, onomatopoeia and modal. For each of these defined word classes, a study of their forms and functions is presented. Part 3 deals with the syntax of Ɓaka, more precisely with the noun phrase, the verb phrase and non-verbal predication, which are the fundamental structuring units of this language. Non-verbal predication encompasses both the use of non-verbal predicators as well as constructions that contain no dedicated predicators and are based on two juxtaposed elements, whose characteristic features are analyzed in detail. Part 3 is also concerned with complex sentences. It examines coordinating and subordinating connectors as well as topicalization and focalization strategies, which reflect hierarchical relations in the sentence. The thesis concludes with a bibliography and an appendix containing three transcribed texts from the corpus. Cameroun Niger-Congo Syntaxe Corpus Cameroon Niger-Congo Syntax Corpus
