1 |
Contrastive lexicology and comparable English-Arabic corpora-based analysis of vague and mistranslated Arabic equivalence : the case of the modern English-Arabic dictionary of al-Mawrid
Almujaiwel, Sultan Nasser January 2012
The main concern in this research is to reveal the existence of shortcomings in the representation of meaning in the equivalents provided in a given context of the bilingual English-Arabic dictionary of al-Mawrid (Ba'albaki 2005), and to disclose the contributions made in Contrastive Lexicology, Bilingual Lexicography, Translation Theory, Corpus Linguistics and Contrastive Linguistics, in an attempt to come up with a more suitable framework, based on bilingual lexicology and corpora-based approaches, for the analysis of equivalence in English-Arabic by means of computerized corpora, especially by what is known as comparable corpora. This research is divided into six chapters. The introduction, Chapter 1, provides the statement of the research problem, the rationale, the objectives and the questions of the study. Chapter 2 discusses three issues: (i) the terms used to refer to the word; (ii) the semantic analysis and relations of the word; and (iii) the disciplines of bilingual lexicography, translation studies and contrastive linguistics, and their respective contributions to the central notion of equivalence in the bilingual dictionary. The discussion of the last issue paves the way for using comparable corpora in the investigation of selected entries and their equivalents in the given context. It also shows how useful and effective such an approach is in criticising existing Arabic equivalents in al-Mawrid (2005). Chapter 3 is a review of the bilingual English-Arabic dictionary of al-Mawrid in terms of its purpose and the representation of meanings and entries. It also includes an overview of previous reviews. The aim is to provide and develop a new critical framework of al-Mawrid by a new multi-approach to equivalence in the English-Arabic dictionary, as given in Chapter 4: this is mainly based on comparable English-Arabic corpora, and the criteria for making two individual corpora comparable rather than parallel. Chapters 5 and 6 are dedicated to the analysis of equivalents which are found to be either vague (see Chapter 5) or a mistranslation (see Chapter 6) in a given context.
|
2 |
Multi word units in a corpus-based study of Memoranda of Understanding : modal multi word units
Awab, Su'ad January 1999
No description available.
|
3 |
Alternative Complementation in Partially Schematic Constructions: a Quantitative Corpus-based Examination of COME to V2 and GET to V2
Lester, Nicholas A. 05 1900
This paper examines two English polyverbal constructions, COME to V2 and GET to V2, as exemplified in Examples 1 and 2, respectively. (1) The senator came to know thousands of his constituents. (2) Little Johnny got to eat ice cream after every little league game. Previous studies considered these types of constructions (though come and get as used here have not been sufficiently studied) as belonging to a special class of complement constructions, in which the infinitive is regarded as instantiating a separate, subordinate predication from that of the “matrix” or leftward finite verb. These constructions, however, exhibit systematic deviation from the various criteria proposed in previous research. This study uses the American National Corpus to investigate the statistical propensities of the target phenomena via lexico-syntactic (collostructional analysis) and morpho-syntactic (binary logistic regression) features, as captured through the lens of construction grammar.
|
4 |
Laaste spore van Nederlands in Afrikaanse werkwoorde / J. Kirsten
Kirsten, Johanita January 2013
In the diachronic studies of Afrikaans in the past, the focus used to be on the origin and early development of Afrikaans from Dutch. During the twentieth century, the philological school, with a tradition of researching all Cape-Dutch coloured texts in detail, was established through the work of J. du P. Scholtz and his students. Through their analyses, they estimated the stabilisation of Afrikaans as early as the end of the eighteenth century (for example Raidt, 1991:145; Ponelis, 1994:229). In the past few decades, however, this estimation has begun to receive criticism from other scholars, including Roberge (1994:159) and Deumert (2004:20). With the help of a corpus, Deumert (2004) has shown that there is substantial variation in Afrikaans letters as late as the early twentieth century, and this study expands on her work by researching the variation in published writing. This is done by focusing on verbs, as there is significant change from the Dutch verbal system to the Afrikaans verbal system. This study uses corpus linguistic research methods, and researches Dutch-Afrikaans variation in verbs in published Afrikaans texts, compiled in three corpora. The main corpus was compiled from all the Afrikaans writings of Totius (J.D. du Toit) in the publication Het Kerkblad from 1916 to 1922. Two control corpora are also used: the first was compiled from excerpts from published Afrikaans books for the same period, and the second was compiled from excerpts from Afrikaans periodicals for the same period. In order to compensate for the shortcomings of corpus data alone, normative works on Afrikaans from the relevant period are also taken into account, showing which recommendations these works made about the relevant constructions, and how the corpus data correlates with these recommendations. Variation in six verbal constructions is analysed in this study:
1. End consonant t/n (for example gaat/gaan): the old (more Dutch) word forms are scarcely used in the corpora, while the modern Afrikaans word forms are almost fully established.
2. End consonant g (for example seg/sê): the old word forms are also scarcely used in the corpora, while the modern word forms take the lead.
3. Stem vowel (for example breng/bring): the old word forms are more frequent at the beginning of the period, followed by some uncertainty, with the modern word forms taking over by the end of the period.
4. Preterite (specifically had/gehad and werd/geword): there is great instability throughout, worsened by a distinction in use between main verbs and auxiliary verbs made by some authors.
5. Past participle (for example gedaan/gedoen): there is significant instability at the beginning of the period, but the modern word forms are used more frequently by the end of the period.
6. Perfect tense auxiliary verb (is/het): the old form is still used in the corpora, but the modern form is more frequent from the beginning, and becomes even more frequent towards the end.
This data shows that there was still significant variation in Afrikaans under Dutch influence as late as the early twentieth century, and the correlation between the different corpora implies that the written language might have been much closer to the spoken language than had been previously assumed. This is further confirmed by the amount of attention this variation gets in the normative works from that period. / Thesis (M.A. (Afrikaans and Dutch))--North-West University, Vaal Triangle Campus, 2013
|
5 |
Getting All the Ducks in a Row: Towards a Method for the Consolidation of English Idioms
Lynn, Ethan Michael 01 June 2016
Idioms play an important role in language acquisition but learners do not have sufficient time to learn all of them. Therefore, learners need to focus on the most frequently occurring idioms, which can be determined by corpus searches. Building off previous corpus studies, this study generated a comprehensive list of English idioms by combining lists from several sources and developed a methodology for organizing and sorting idioms within the list. In total, over 27,000 idiom forms were amalgamated and a portion of the list was compiled, which featured 2,697 core idioms and 5,559 variant idiom forms. It was found that over 35% of idioms varied structurally and thirteen types of idiom variation were highlighted. Additionally, issues concerning idiom boundaries were investigated. These results are congruent with previous findings which show that variation is a commonly occurring element of idioms. Furthermore, specific problematic elements for future corpus searches and English language learners are identified.
|
7 |
From hypotaxis to parataxis : an investigation of English-German syntactic convergence in translation
Bisiada, Mario January 2013
Guided by the hypothesis that translation is a language contact situation that can influence language change, this study investigates a frequency shift from hypotactic to paratactic constructions in concessive and causal clauses in German management and business writing. The influence of the English SVO word order is assumed to cause language users of German to prefer verb-second, paratactic constructions to verb-final, hypotactic ones. The hypothesis is tested using a 1-million-word diachronic corpus containing German translations and their source texts as well as a corpus of German non-translations. The texts date from 1982–3 and 2008, which allows a diachronic analysis of changes in the way English causal and concessive structures have been translated. The analysis shows that in the translations, parataxis is indeed becoming more frequent at the expense of hypotaxis, a phenomenon that, to some extent, also occurs in the non-translations. Based on a corpus of unedited draft translations, it can be shown that translators rather than editors are responsible for this shift. Most of the evidence, however, suggests that the shift towards parataxis is not predominantly caused by language contact with English. Instead, there seems to be a development towards syntactically simpler constructions in this genre, which is most evident in the strong tendency towards sentence-splitting and an increased use of sentence-initial conjunctions in translations and non-translations. This simplification seems to be compensated for, to some extent, by the establishment of pragmatic distinctions between specific causal and concessive conjunctions.
|
8 |
When if is when and when is then: The particle nı̨dè in Tłı̨chǫ
Anisman, Adar 11 December 2019
The purpose of this dissertation is to account for the syntactic and semantic traits of Tłı̨chǫ modal clauses within a cross-linguistic typology of conditional clauses.
This dissertation provides a comprehensive description and analysis of clauses that are introduced by nı̨dè, a Tłı̨chǫ word with cognates in many Dene languages. Clauses that are introduced by nı̨dè are modal adjuncts, which cover predictions about the future (future temporal adverbial clauses, when), hypothetical scenarios (conditional clauses, if) and generic or habitual generalisations about the world (restrictive clauses, whenever).
I provide a unified account for all of these uses by showing that they are all in the realm of modality. I then hypothesise that nı̨dè is a complementiser which introduces a modal adjunct clause. I follow von Fintel (2006) and Kratzer (2012) and suggest that nı̨dè restricts a modal operator in its apodosis. This account explains apparent gaps in the Tłı̨chǫ grammar, and in particular within concessive adjunct clauses ('even though...'), which cannot be introduced by nı̨dè, and attributes this mismatch to the difference between the factivity of concessive adjunct clauses on the one hand and modality in clauses introduced by nı̨dè on the other hand. I contrast this with concessive conditional clauses ('if... even...'), which can be introduced by nı̨dè, and in which nı̨dè scopes over the concessive adverb kò (following Bennett, 1982, 2003).
This work highlights the ways in which Tłı̨chǫ conditionals are different from, and similar to, previous cross-linguistic generalisations of conditionals. Conditionals in Tłı̨chǫ and other Dene languages differ from many accounts of conditionals, which focus on the role of the verbal form in communicating speaker attitudes about the hypotheticality of the proposition in the conditional (Iatridou, 2000; Karawani, 2014). In contrast, Tłı̨chǫ uses verb aspect inside clauses to indicate the action as complete or incomplete, much like in matrix clauses. Tłı̨chǫ speakers communicate their attitudes of the likelihood and hypotheticality of the proposition using other means, such as adverbs and evidentials.
However, Tłı̨chǫ is also similar to other languages, in extending the modal nature of conditional clauses to a subtype of conditionals called premise conditionals, which communicate rhetorical devices and a variety of metatextual comments (Dancygier, 1993, 1999). This is unexpected, as I argue that nı̨dè must introduce a modal clause, whereas premise conditionals seemingly deal with facts. I argue that despite first impressions, Tłı̨chǫ premise conditionals are still within the realm of modality, as they are either used to express propositions that are not accepted as fact by the speaker, or are used to restrict a modal in the adjoined clause, much like hypothetical conditionals. The structure of Tłı̨chǫ premise conditionals is likewise similar to the structure that has been proposed in the past for other languages (Haegeman, 2003, 2010). / Graduate
|
9 |
Iconic Semantics in Phonology: A Corpus Study of Japanese Mimetics
Caldwell, Joshua Marrinor 29 November 2010
Recent research on Japanese mimetics examines which part of speech the mimetic occurs as. An individual mimetic can appear as a noun, an adjective, an adverb, or a verb (Tsujimura & Deguchi 2007, 340). It is assumed by many scholars that mimetic words essentially function as adverbs (Inose 2007, 98). Few data-based studies exist that quantify the relative frequency of mimetic words in different word categories. Akita (2009) and Caldwell (2009a) have performed small-scale or preliminary studies of this aspect of Japanese mimetics. The use of mimetics in other grammatical function categories has been attributed to the polysemous nature of Japanese mimetics (Key 1997). The common explanation is that the flexibility of mimetics is probably due to their iconicity (Sugiyama 2005, 307; Akita 2009; among others). Yet the definition of "iconicity" is often incomplete or cursory in nature. Newmeyer, Nuckolls, Kohn, and Key all accept or suggest the philosophies of C.S. Peirce as a possible explanation or source for understanding the iconicity of mimetic words. The purpose of this thesis is twofold: first, to examine the prominent semantic theories regarding Japanese mimetics and show how the philosophies of Peirce can add clarity; second, to examine the overall occurrence of 1700+ mimetics by part of speech using the data from the Kotonoha (http://www.kotonoha.gr.jp) and JpWaC (http://corpus.leeds.ac.uk/) Corpora. Peirce identified three distinct icon types: icons of abstract quality (1-1-1), icons of physical instantiation (1-1-2), and icons of abstract relation (1-1-3). These three types correspond to three distinct types of mimetic word: phonomimes (abstract sound qualities), typically predicate modifiers; phenomimes (physical actions), more often nouns or noun modifiers; and psychomimes (relational), more often verbs or parts of verbs. Corpus data validates the observation that mimetics usually function as predicate modifiers, but also supports Akita's hypothesis that psychomimes are incorporated into verbs more readily than other mimetics, which in turn is explained by the Peircean analysis.
|
10 |
Understanding the Effect of Formulaic Language on ESL Teachers' Perceptions of Advanced L2 Writing: An Application of Corpus-Identified Formulaic Language
Youngblood, Alison 01 January 2014
A quantitative study was conducted to determine if the amount of formulaic language influenced ESL teachers' perceptions (n=102) of non-native writing skill, as evidenced by composite and sub-scale scores on the ESL Composition Profile (Jacobs et al., 1981). Formulaic language was operationalized as 25 three-word strings sampled from the writing sub-list of the Academic Formulas List (Simpson-Vlach & Ellis, 2010) and further validated as frequent in the Michigan Corpus of Upper Level Student Papers. The target formulaic sequences were divided into three experimental groups representing a low, mid, and high amount of formulaic language. Four advanced non-native writers generated argumentative, timed writing samples that incorporated the target sequences. The writing samples were then assembled into data collection packets and distributed at eight Intensive English Programs across the southeastern United States. A repeated measures ANOVA indicated that there was a significant difference in composite score (p < .05) between the control and three experimental conditions; however, the essays that incorporated 16 and 25 formulaic sequences scored significantly lower than those with zero or eight target sequences. When the number of syntactic and semantic errors was strictly controlled for, the composite scores also fell between the control and experimental conditions, but the decrease in score was not significant (p > .05). The content, organization, vocabulary, language, and mechanics sub-scales were also compared using a repeated measures MANOVA. In content, organization, and language, the control and low essays outscored the mid and high conditions (p < .05). For the vocabulary sub-scale, the control and low conditions were not significantly different, but the control essays outperformed only the mid-level essays. The low essays outperformed both the mid and high essays. In terms of mechanics, there was a significant difference only between the low and mid-level essays. The results of the MANOVA were consistent when the number of syntactic and semantic errors was controlled. Implications for teaching suggest that the Academic Formulas List would not benefit academically-oriented L2 learners preparing to enter a university. While corpus tools are valuable in helping teachers, material writers, and publishers improve vocabulary instruction in the English classroom, not all statistically salient lexical combinations are important for non-native writers to master and incorporate in their academic discourse.
|