Global ETD Search

1	Towards the development and application of representative lexicographic corpora for the Gabonese languages
Soami, Leandre Serge 03 1900
Thesis (DLitt (Afrikaans and Dutch))--University of Stellenbosch, 2010.
ENGLISH ABSTRACT: The compilation of dictionaries is a laborious activity, and it takes time, money and staff to achieve the objectives of any dictionary project. Many dictionaries have been compiled on the basis of the lexicographers’ personal intuition and guesswork rather than a corpus. As a result, some dictionaries are often criticised by users for failing to represent some important lexical items. This can probably be explained by the fact that most of these dictionaries were compiled in an era when theoretical lexicography was lacking or not well established. The last decades have witnessed the emergence of metalexicography as a theory directed also at dictionary planning, in order to enhance the quality of lexicographic practice and the way in which the management and the compilation of dictionaries are dealt with. The planning of dictionaries takes into account not only the gathering of the language material to be used but also the way in which this material will be treated and presented on both the macrostructural and the microstructural level, as well as in the front matter texts and the back matter texts. In order to enhance the quality of the presentation in dictionaries, this dissertation pleads in favour of the formulation of a data collection policy that takes into consideration all the different sources of material, written and spoken, used in the different phases of the compilation of a dictionary. The three phases that form the main focus of this study are the material acquisition phase, the material preparation phase and the material processing phase.
The involvement of the speech community in the compilation of a lexicographic corpus ensures the collection of representative and balanced data, and the different needs of that community are central to the dictionary project. The different language materials can be organised into different corpus types. The efficiency of a corpus resides in its capacity to provide different data types that can be included in the comment on semantics and the comment on form of each article in the central list of each dictionary. Some dictionaries lack a good representation of data in both these comments in the different articles. However, languages such as the Gabonese languages are in a privileged situation because they can still avoid the mistakes of other dictionary compilers by investing in corpus-based dictionaries at this early stage. Therefore, the establishment of lexicographic units with multifunctional tasks can play an important role. In a multilingual environment such as Gabon the issue of language status needs to be dealt with carefully because it is realistic to choose a certain number of languages to function as official languages. Different alphabets are presented in this study and realistic choices are made. The way in which the language material is organised will impact on the quality of the macrostructure and microstructure; this is essential because dictionaries are consulted most of the time for the spelling of a given lexical item, for a translation equivalent or for the explanation of the meaning of a lemma sign. The computerisation of a corpus is a focal point and needs to be done in a satisfactory manner that presents a clean and helpful corpus in order to provide the lexicographer with useful statistics, frequency word lists and the different concordance lines that are very important for the wording of definitions and the extraction of example sentences. 
This is why a corpus is seen as an indispensable tool in the improvement of the macro- and the microstructure of any type of dictionary.
Lexicographic corpora; Gabonese languages -- Lexicography; Word frequencies; Concordance lines; Dissertations -- Afrikaans language; Theses -- Afrikaans language
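The frequency word lists and concordance lines that the abstract above describes as the main outputs of a computerised corpus can be illustrated with a minimal sketch. It is not the method used in the thesis: the tokenizer is deliberately naive, and the function names (`tokenize`, `frequency_list`, `concordance`) and the sample sentence are hypothetical choices made for illustration only.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a naive rule, assumed here)."""
    return re.findall(r"\w+", text.lower())


def frequency_list(tokens):
    """Return (word, count) pairs, most frequent first."""
    return Counter(tokens).most_common()


def concordance(tokens, node, width=3):
    """Return keyword-in-context triples (left context, node word, right context)."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Hypothetical sample text, not taken from any corpus discussed in the abstract.
sample = "The corpus helps the lexicographer. A clean corpus gives useful frequency lists."
tokens = tokenize(sample)

for word, count in frequency_list(tokens)[:3]:
    print(word, count)

for left, node, right in concordance(tokens, "corpus", width=2):
    print(f"{left:>12} [{node}] {right}")
```

In a real project the token list would come from the cleaned corpus files, and the concordance lines would be sorted and filtered before a lexicographer reads them for definitions and example sentences.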
2	Working closely with corpora. Proposta de ensino de colocações adverbiais em inglês para negócios, sob a luz da Linguística de Corpus / Working closely with corpora. A proposal for teaching typical adverbial collocations in Business English, based on Corpus Linguistics principles
Santos, Andrea Geroldo dos 25 October 2011
The aim of this work is to put forward a proposal for teaching typical adverbial collocations in Business English, based on Corpus Linguistics principles. To that end, we compiled a monolingual corpus in British and American English, composed of business press texts and company reports available online, totalling 2,310,429 words (the Portuguese abstract gives 2,310,143). Using WordSmith Tools 5.0 and statistical scores (T-score and Mutual Information), we identified the twenty-five most frequent adverbial collocations in the study corpus and analysed their lexico-grammatical patterns. From these collocational units we then selected the ones that could be taught in class, following criteria set by the researcher and based on Tim Johns's data-driven learning (DDL) approach (JOHNS, 1991), Carter's modelling (CARTER, 1998) and the objections raised to the use of concordance lines in language teaching. We then designed exercises that were applied in three pilot studies, carried out with Business English students at a private language school in São Paulo, Brazil, at the following levels: pre-intermediate, intermediate and upper-intermediate. The three studies demonstrated that such exercises can be used successfully in Business English classes, confirming the relevance of Corpus Linguistics as an approach to language teaching.
adverbial collocations; Business English; concordance lines; conventionality; Corpus Linguistics; DDL approach; language teaching
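The two association scores named in the abstract above, T-score and Mutual Information, are commonly computed from the observed co-occurrence count and the count expected by chance. The sketch below uses the textbook formulas, MI = log2(O / E) and t = (O - E) / sqrt(O), with E = f(node) * f(collocate) / N. It is an assumption that these match the exact calculation in WordSmith Tools 5.0, which may also adjust for the collocation span; the function names and all counts are hypothetical.

```python
import math


def expected_cooccurrences(f_node, f_collocate, n_tokens):
    """Co-occurrences expected by chance if the two words were independent."""
    return f_node * f_collocate / n_tokens


def mutual_information(observed, f_node, f_collocate, n_tokens):
    """MI = log2(O / E): how much more often the pair occurs than chance predicts."""
    expected = expected_cooccurrences(f_node, f_collocate, n_tokens)
    return math.log2(observed / expected)


def t_score(observed, f_node, f_collocate, n_tokens):
    """t = (O - E) / sqrt(O): rewards pairs backed by many observations."""
    expected = expected_cooccurrences(f_node, f_collocate, n_tokens)
    return (observed - expected) / math.sqrt(observed)


# Hypothetical counts for an adverb + verb pair in a one-million-word corpus.
N = 1_000_000
mi = mutual_information(observed=50, f_node=1000, f_collocate=500, n_tokens=N)
t = t_score(observed=50, f_node=1000, f_collocate=500, n_tokens=N)
print(f"MI = {mi:.2f}, t-score = {t:.2f}")
```

The two scores are usually read together: MI tends to favour rare but exclusive pairings, while the T-score favours frequent ones, so a collocation that ranks well on both is a safer candidate for teaching material.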
3	Padrões de usos de pronomes átonos lexicalizados no espanhol: um estudo baseado na Linguística de corpus / Patterns of usage of lexicalized unstressed pronouns in Spanish: a study based on Corpus linguistics
Serikaku, Helenice 13 June 2014
Brazilian learners of Spanish as a foreign language (ELE) frequently have difficulties with the use of unstressed pronouns, because the two languages handle this group of particles differently: in Spanish their use tends to be marked, whereas in Brazilian Portuguese they tend to be omitted. The difficulty concerns not only the use of these pronouns in their canonical function as direct and indirect objects, but also the use of the forms la, las, lo and le without that function, known as lexicalized unstressed pronouns (PAL, from the Portuguese pronomes átonos lexicalizados), which are the object of this research. The research aims to identify patterns of use of the PAL la, las, lo and le, as well as the main verbs that co-occur with them, following the theoretical principles and methodological resources of Corpus Linguistics (BERBER SARDINHA, 2000, 2004), notably its lexicographic tradition (SINCLAIR, 2004). The study corpus was esTenTen 11 (KILGARRIFF, 2004), a corpus representative of general Spanish that covers European and American variants and contains almost ten billion words, morphosyntactically tagged. First, the verbs that co-occur with PAL were extracted by reading concordance lines obtained from the corpus, focusing on the non-canonical forms of the unstressed pronouns. Further searches over the concordance lines then identified the main patterns of use of PAL. The forms la, las, lo and le were found to behave in varied ways, which makes it hard to classify each of these particles functionally. They may allude to an absent referent, competing with verb patterns that have a present referent; influence the meaning of the verb; give the verb a pragmatic and expressive sense; and tend to appear alongside the verb in line with the Idiomatic Principle. This research sought to contribute to an area with few studies, whose importance is clear given the pedagogical needs of Brazilian learners of Spanish as a foreign language.
Lexicalized unstressed pronouns; Corpus linguistics; Patterns; Concordance lines
