Global ETD Search

111	Corpus Linguistics and Cultural Difference in Canada Fee, Margery January 2005 (has links) A brief account of the work of the Strathy Language Unit (Queen's University) to produce a corpus suitable for supporting the publication of the Guide to Canadian English Usage (Oxford 1997, 2nd ed. 2007). Canadian lexicography linguistic nationalism corpus linguistics digital resources for Canadian English
112	Genre Features of Personal Statements by Chinese English-as-an-Additional-Language Writers: A Corpus-Driven Study Chen, Sibo 07 May 2013 (has links) Personal Statements (PSs) are self-narrative essays written for Western graduate school applications, which serve an important role in Western graduate schools’ admission processes. However, genre features of PSs have not been sufficiently addressed by previous genre studies. Such neglect indicates a promising area for investigation as the increasing number of non-native English speakers in Western higher education systems creates an urgent pedagogical need for PS-related English-as-an-Additional-Language (EAL) instruction. The present thesis reports a corpus-driven genre analysis of PSs written by Chinese EAL students (CEAL-PSs). Based on a corpus of 120 CEAL-PS samples, genre features of CEAL-PSs were investigated from three perspectives: (1) linguistic complexity (i.e. lexical diversity and grammatical intricacy), (2) content foci (i.e. at the lexical, phrasal, discoursal levels), and (3) functional move structure. In addition, comparative analyses were made between unedited and edited CEAL-PSs for investigating whether the editing process significantly changed the unedited CEAL-PSs in the above three perspectives. There were three major findings of the current study. First, the majority of lexicons used by the collected CEAL-PSs were frequent academic lexicons and the average grammatical intricacy of these samples was at senior high school or junior college levels. Second, expressions of self-promotion and discussions of academic/professional achievements were explicitly emphasized in the collected CEAL-PSs at the lexical, phrasal, and discoursal levels. Third, an IERC model (“Introduction,” “Establishing Credentials,” “Reasons for Application,” and “Conclusion”), was found to be followed by the majority of the collected CEAL-PSs. 
Based on the above findings, the thesis further discusses the current study’s theoretical, methodological, and pedagogical implications for EAL writing instruction in China. English as an Additional Language Genre Personal Statement Discourse Analysis Corpus Linguistics
113	Undergraduate Student Writing Across the Disciplines: Multi-Dimensional Analysis Studies Hardy, Jack 18 December 2014 (has links) This dissertation uncovers and examines linguistic and functional patterns of student writing in the first two years of college. A corpus of student papers from six disciplines (philosophy, English, psychology, biology, chemistry, and physics) was collected, and multi-dimensional (MD) analysis (Biber, 1988) was used to examine the ways that discipline and paper type influence writing. Further explorations of the data compare lower-level student writing to upper-level student writing, professional academic biology writing, and the discipline-specific approximations of an English for Specific Purposes (ESP) course. Findings show that specificity of both linguistic and functional properties exists even at such low levels of disciplinary acculturation. These studies are followed by a summary and contextualization of their findings. Finally, future inquiry using collected data and future investigations into student literacy practices are proposed. Student writing Corpus linguistics Disciplinary specificity Multi-dimensional analysis Academic writing English for Specific Purposes
114	Keeping it in the family : disentangling contact and inheritance in closely related languages Colleran, Rebecca Anne Bills January 2017 (has links) The striking similarities between Old English (OE) and its neighbour Old Frisian (OFris), including aspects of phonology, morphology, and alliterative phrases, have long been cause for comment, and often for controversy. The question of whether the resemblance was caused by an immediate common ancestor (Anglo-Frisian) or by neighbouring positions in a dialect continuum/Sprachkreis has been hotly disputed using phonological and toponymic evidence, but not in recent years. Consensus in the nineties fell in favour of the dialect continuum, and there the issue has largely rested. However, recent finds in archaeology, history, and genetics argue that the case requires a second look. Developments in grammaticalization theory and contact linguistics give us new tools with which to investigate. Are the similarities between OE and OFris due to an exclusive shared ancestor, or are those languages merely part of a dialect continuum, with no closer relationship than that shared with the other early West Germanic dialects? And are there any reliable criteria to separate out inheritance-based similarities from those that are spread by contact? Shared developments seem, prima facie, to be evidence of shared inheritance, but there are other possible explanations. Parallel drift after separation, convergent development, or coincidence might be the cause of any shared feature. In this paper, I discuss recently proposed methods of distinguishing inheritance from drift and contact, focusing on how morphosyntax can help explore the shared history of OE and OFris. While grammaticalization processes often lead to cross-linguistic similarities, the fact that OE and OFris display a cluster of grammaticalizations not found in other early West Germanic dialects may be significant.
The exclusive developments under investigation include aga(n) ‘have’ > ‘have to’ and the present participle as verbal complement. By comparing the forms, meanings, and distribution of these grammaticalized forms in the OFris corpus to those of their cognate forms in OE, I show that the two languages probably diverged from one another substantially later than they diverged from Old Saxon and Old Low Franconian.
115	Anglo-Saxon medicine and disease : a semantic approach Doyle, Conan Turlough January 2017 (has links) As a semantic investigation into Anglo-Saxon medicine, this thesis investigates the ways in which the Old English language was adapted to the technical discipline of medicine, with an emphasis on semantic interference between Latin medical terminology and Old English medical terminology. The main purpose of the examination is to determine the extent to which scholarly ideas concerning the nature of the human body and the causes of disease were preserved between the Latin texts and the English texts which were translated and compiled from them. The main way in which this has been carried out is through a comparative analysis of technical vocabulary, excluding botanical terms, in medical prose texts utilising the Dictionary of Old English Web Corpus of texts, and a selection of printed editions of Latin texts which seem to have been the most likely sources of medical knowledge in Anglo-Saxon England. As a prerequisite to this comparative methodology it has been necessary to assemble a corpus of Latin textual parallels to the single most significant Old English medical text extant, namely Bald’s Leechbook. These parallels have been presented in an appendix alongside a transcript and translation of Bald’s Leechbook. A single question thus lies at the heart of this thesis: did Old English medical texts preserve any of the classical medical theories of late antiquity? In answering this question, a number of other significant findings have come to light. 
Most importantly, it is to be noted that modern scholarship is only now beginning to focus on the range of Late Antique and Byzantine medical texts available in Latin translation in the early medieval period, most notably for our present purposes Alexander of Tralles, but also Oribasius, Galen, pseudo-Galen and several Latin recensions of the works of Soranus of Ephesus, including the so-called Liber Esculapii and Liber Aurelii. The linguistic study further demonstrates that the technical language of these texts was very well understood and closely studied in Anglo-Saxon England, the vernacular material not only providing excellent readings of abstruse Latin technical vocabulary, but also demonstrating a substantial knowledge of technical terms of Greek origin which survive in the Latin texts.
116	Collocates of trans, transgender(s) and transexual(s) in British Newspapers: A Corpus-Assisted Critical Discourse Analysis Törmä, Kajsa January 2018 (has links) Through their coverage in the mass media, transgender people and the trans rights movement have only recently stepped into the public eye. Because this emergence is so recent, it has not been widely studied within the field of linguistics. This thesis aims to explore the representation of transgender people in newspapers using an approach informed by corpus linguistics and critical discourse analysis. Using collocation and concordance line analysis, it identifies and discusses what semantic prosodies exist surrounding transgender people in The Daily Mail and The Guardian during 2015–2017. Many different semantic prosodies were found, and most of them were neither clearly negative nor positive towards transgender people. The prosodies were found to sometimes overlap and reinforce each other, and dominant news stories surrounding transgender people seemed to have great staying power. The overall conclusion is that transgender language in newspapers is still in its formative years and that additional research in this field is necessary. / 2018-08-21 CDA Corpus Linguistics Newspaper Discourse Semantic Prosodies Transgender Humanities and the Arts Humaniora och konst
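The collocation step mentioned in entry 116 can be illustrated with a minimal sketch. It only counts raw co-occurrences within a fixed window around a node word. The function name `collocates`, the window size, and the simple word tokenizer are assumptions made for illustration; the thesis does not state which tool, span, or association measure it used.

```python
import re
from collections import Counter


def collocates(text, node, window=4):
    """Count words occurring within `window` tokens of each occurrence of `node`.

    Hypothetical helper: raw co-occurrence counts only, with no
    association measure (such as MI or log-likelihood) applied.
    """
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            left = tokens[max(0, i - window):i]       # tokens before the node
            right = tokens[i + 1:i + 1 + window]      # tokens after the node
            counts.update(left)
            counts.update(right)
    return counts


sample = "The trans rights movement and trans people"
print(collocates(sample, "trans", window=1))
```

In practice the counts would be compared against overall word frequencies before any collocate is treated as meaningful.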
117	Verblexpor : um recurso léxico com anotação de papéis semânticos para o português Zilio, Leonardo January 2015 (has links) This dissertation develops a lexical resource of Portuguese verbs annotated with semantic roles, called VerbLexPor, based on resources such as VerbNet, PropBank, and FrameNet. The theoretical bases of the study are interdisciplinary, drawing on Corpus Linguistics and Natural Language Processing (NLP), so that it aims to contribute to both Linguistics and Computer Science. The research hypotheses are: a) a single set of semantic roles can be applied to different genres; and b) the differences among these genres show up in the ranking of semantic roles. The development of VerbLexPor rests on two corpora: a specialized one, with more than 1.6 million words, composed of scientific papers in Cardiology from three Brazilian journals; and a non-specialized one, with more than 1 million words, composed of articles from the popular newspaper Diário Gaúcho. The corpora were annotated with the parser PALAVRAS, and sentence, verb, and argument information was extracted and stored in a database. VerbLexPor has 192 verbs and more than 15 thousand arguments annotated with semantic roles, distributed among more than 6 thousand sentences. The Diário Gaúcho corpus favours a direct syntax, with little use of the passive voice and adjuncts, while the Cardiology corpus has more passive voice, more INSTRUMENTS in subject position, and fewer AGENTS. Some parallel experiments were also conducted, such as semantic role annotation by multiple annotators and automatic verb clustering.
In the multiple-annotator task, every annotator annotated exactly the same 25 sentences. The annotators received an annotation manual and basic training (an explanation of the task and two annotation examples). Agreement among annotators was evaluated with multi-π, and the result was π = 0.25. This low agreement may be due to the lack of more thorough training. The verb clustering task showed that syntax and semantics are equally important for clustering. This study contributes to Linguistics, with a verb lexicon annotated with semantic roles, and also to Computer Science, with data that can be queried and processed for various NLP applications, especially because they are available in both XML and SQL formats. Língua portuguesa Linguística computacional Corpus Linguagem especializada Semantic role labeling Lexical resource NLP Corpus linguistics
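The multi-π agreement score reported in entry 117 can be sketched as follows, assuming the usual definition (Fleiss's multi-rater generalization of Scott's π): observed pairwise agreement is corrected by the chance agreement expected from the pooled label distribution. The function name `multi_pi` and the input layout (one list of labels per item, one label per annotator) are assumptions for illustration, not the thesis's own implementation.

```python
from collections import Counter


def multi_pi(annotations):
    """Multi-pi agreement for items that were each labelled by the same number of annotators.

    `annotations` is a list of items; each item is a list of category
    labels, one per annotator (hypothetical input format).
    """
    n_items = len(annotations)
    n_raters = len(annotations[0])
    if n_raters < 2 or any(len(item) != n_raters for item in annotations):
        raise ValueError("every item needs the same number (>= 2) of labels")

    pooled = Counter()   # label counts over all items and annotators
    observed = 0.0       # running sum of per-item pairwise agreement
    for item in annotations:
        counts = Counter(item)
        pooled.update(counts)
        agreeing_pairs = sum(c * (c - 1) for c in counts.values())
        observed += agreeing_pairs / (n_raters * (n_raters - 1))
    observed /= n_items

    total = n_items * n_raters
    expected = sum((c / total) ** 2 for c in pooled.values())
    if expected == 1.0:
        raise ValueError("agreement is undefined when only one label is ever used")
    return (observed - expected) / (1.0 - expected)


# Three annotators, two items, full agreement on each item.
print(multi_pi([["AGENT", "AGENT", "AGENT"], ["THEME", "THEME", "THEME"]]))
```

A value of 1 means perfect agreement and 0 means agreement no better than chance, which is why a score of 0.25 is read as low.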
118	Irish English modal verbs from the fourteenth to the twentieth centuries Van Hattum, Marije January 2012 (has links) The thesis provides a corpus-based study of the development of Irish English modal verbs from the fourteenth to the twentieth centuries in comparison to mainland English. More precisely, it explores the morpho-syntax of CAN, MAY, MUST, SHALL and WILL and the semantics of BE ABLE TO, CAN, MAY and MUST in the two varieties. The data of my study focuses on the Kildare poems, i.e. fourteenth-century Irish English religious poetry, and a self-compiled corpus consisting of personal letters, largely emigrant letters, and trial proceedings from the late seventeenth to the twentieth centuries. The analysis of the fourteenth and nineteenth centuries is further compared to a similar corpus of English English. The findings are discussed in the light of processes associated with contact-induced language change, new-dialect formation and supraregionalization. Contact-induced language change in general, and new-dialect formation in particular, can account for the findings of the fourteenth century. The semantics of the Irish English modal verbs in this century were mainly conservative in comparison to English English. The Irish English morpho-syntax showed an amalgam of features from different dialects of Middle English in addition to some forms which seem to be unique to Irish English. The Irish English poems recorded a high number of variants per function in comparison to a selection of English English religious poems, which does not conform to predictions based on the model of new-dialect formation. I suggest that this might be due to the fact that the English language had not been standardized by the time it was introduced to Ireland, and thus the need to reduce the number of variants was not as great as it is suggested to be in the post-standardization scenarios on which the model is based. 
In seventeenth- and eighteenth-century Ireland, increased Irish/English bilingualism caused the formation of a second-language (L2) variety of English. In the nineteenth century the bilingual speakers massively abandoned the Irish language and integrated into the English-speaking community. As a result, the varieties of English as spoken by the bilingual speakers and as spoken by the monolingual English speakers blended and formed a new variety altogether. The use of modal verbs in this new variety of Irish English shows signs of colonial lag (e.g. in the development of a deontic possibility meaning for CAN). Additionally, the subtle differences between BE ABLE TO and CAN in participant-internal possibility contexts and between epistemic MAY and MIGHT in present time contexts were not fully acquired by the L2 speakers, which resulted in a higher variability between the variants in the new variety of Irish English. In the late nineteenth and early twentieth centuries the use of modal verbs converged on the patterns found in English English, either as a result of linguistic accommodation in the case of informants who had migrated to countries such as Australia and the United States, or as a result of supraregionalization in the case of those who remained in Ireland. 425
119	Corpop : um corpus de referência do português popular escrito do Brasil Pasqualini, Bianca Franco January 2018 (has links) This thesis proposes a corpus of written popular Brazilian Portuguese, called CorPop, with texts selected on the basis of the average literacy level of the country's readers. CorPop's theoretical and methodological bases are interdisciplinary and fall within the scope of Language Studies and related disciplines, such as Lexical Studies and Corpus Linguistics, Text Linguistics, and Psycholinguistics, in dialogue with Natural Language Processing studies. The research is accordingly housed in the research line Lexicography, Terminology and Translation: Textual Relations of PPG-Letras-UFRGS, and its focus therefore tends towards the lexicon. CorPop was developed by compiling data on the literacy level of Brazilian readers and on the characteristics that could make up a standard of textual simplicity in a corpus of texts suited to these readers. These data were collected from the surveys Indicador de Alfabetismo Funcional (INAF, Indicator of Functional Literacy) and Retratos da Leitura no Brasil, as well as from a questionnaire answered by readers.
The texts selected for CorPop are (1) popular journalism texts from the PorPopular Project (the newspaper Diário Gaúcho), read massively by classes C and D, which correspond to the average Brazilian reader; (2) the texts and authors most read by respondents to the latest editions of the Retratos da Leitura no Brasil survey; (3) the collection "É Só o Começo" (adaptations of classics of Brazilian literature for readers with low literacy, carried out by linguists); (4) texts from the newspaper Boca de Rua, produced by homeless people with little schooling and low literacy; and (5) texts from the Diário da Causa Operária, a Brazilian workers' newspaper also produced by people within the country's average literacy range. After collecting, preparing, and processing the corpus texts, we carried out a series of experiments with CorPop's raw frequency list and its lemmatized frequency list. The results show promising applications of CorPop in several linguistic tasks, from text simplification to use as a controlled vocabulary for writing defining paraphrases in dictionaries, and demonstrate that a small corpus can have the same validity as a large one. Língua portuguesa Leitura : Compreensão Lingüística de corpus Corpus of popular Brazilian Portuguese Corpus linguistics Text simplification
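The difference between the raw and the lemmatized frequency lists used in the CorPop experiments (entry 119) can be shown with a short sketch. The function `frequency_lists` and the tiny lemma dictionary are hypothetical stand-ins; the thesis does not say which lemmatizer or tooling produced its lists.

```python
from collections import Counter


def frequency_lists(tokens, lemma_of):
    """Return (raw, lemmatized) frequency counters for a list of tokens.

    `lemma_of` maps a lower-cased word form to its lemma; forms missing
    from the mapping are counted under their own form (an assumption
    made to keep the sketch self-contained).
    """
    raw = Counter(token.lower() for token in tokens)
    lemmatized = Counter()
    for form, count in raw.items():
        lemmatized[lemma_of.get(form, form)] += count
    return raw, lemmatized


# Toy data: three inflected forms of "ler" (to read), two of "livro" (book).
tokens = ["Leio", "lê", "ler", "livro", "livros"]
lemma_of = {"leio": "ler", "lê": "ler", "livros": "livro"}
raw, lemmatized = frequency_lists(tokens, lemma_of)
print(raw.most_common())
print(lemmatized.most_common())
```

Grouping inflected forms under one lemma changes the ranking of items, which is why the two lists can support different lexical analyses.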
120	Análise de um corpus de produção escrita em português por crianças e adultos indígenas bilíngues/monolíngues de Dourados/MS a partir da linguistíca de corpus Espindola, Sandra January 2014 (has links) This investigation arose from the aim of understanding, from a Corpus Linguistics perspective, the origin of the difficulties that indigenous children and adults show when producing texts in Portuguese. To that end, a corpus was built of 483 texts by children and 349 texts by adults, written in Portuguese by indigenous and non-indigenous children and adults. The children's sample comprised 175 children: 111 indigenous (71 Guarani/Kaiowá bilinguals and 40 Terena monolinguals) and 64 non-indigenous monolingual speakers of Portuguese, all pupils in the 4th and 5th years of elementary school. The adult group comprised 118 adults: 74 indigenous (36 Guarani/Kaiowá bilinguals and 38 Terena monolinguals) and 44 non-indigenous monolingual speakers of Portuguese, in the first and final years of higher education. The specific objectives of the research were: (a) to determine whether the kinds of difficulty shown by monolingual and bilingual indigenous participants of different ethnic groups (Kaiowá/Guarani and Terena) differ from those of non-indigenous monolinguals in producing narrative texts in Portuguese; (b) in comparing the two age groups, children and adults, to observe to what extent the path from basic education to university affected the development of text-writing skills; and (c) for the adult groups, to investigate whether the time spent in the undergraduate course (students in the first and fourth years) affects the level of difficulty in producing texts. The data were analyzed with the AntConc tool, within the theoretical framework of Corpus Linguistics.
This research is expected to help teachers, both those who teach the university students and those who teach the children, understand how the writing of these two indigenous groups is structured. This information is essential for future guidance on the reading and writing work proposed by schools and by university courses that receive indigenous students. Escrita Língua portuguesa Ensino e aprendizagem Educação indígena Produção textual Language education Teaching indigenous Corpus linguistics
