Spelling suggestions: "subject:"computational linguistics -- dethodology"" "subject:"computational linguistics -- methododology""
1 |
Towards a corpus of Indian South African English (ISAE) : an investigation of lexical and syntactic features in a spoken corpus of contemporary ISAEPienaar, Cheryl Leelavathie January 2008 (has links)
There is consensus among scholars that there is not just one English language but a family of “World Englishes”. The umbrella-term “World Englishes” provides a conceptual framework to accommodate the different varieties of English that have evolved as a result of the linguistic cross-fertilization attendant upon colonization, migration, trade and transplantation of the original “strain” or variety. Various theoretical models have emerged in an attempt to understand and classify the extant and emerging varieties of this global language. The hierarchically based model of English, which classifies world English as “First Language”, “Second Language” and “Foreign Language”, has been challenged by more equitably-conceived models which refer to the emerging varieties as New Englishes. The situation in a country such as multi-lingual South Africa is a complex one: there are 11 official languages, one of which is English. However the English used in South Africa (or “South African English”), is not a homogeneous variety, since its speakers include those for whom it is a first language, those for whom it is an additional language and those for whom it is a replacement language. The Indian population in South Africa are amongst the latter group, as theirs is a case where English has ousted the traditional Indian languages and become a de facto first language, which has retained strong community resonances. This study was undertaken using the methodology of corpus linguistics to initiate the creation of a repository of linguistic evidence (or corpus), of Indian South African English, a sub-variety of South African English (Mesthrie 1992b, 1996, 2002). Although small (approximately 60 000 words), and representing a narrow age band of young adults, the resulting corpus of spoken data confirmed the existence of robust features identified in prior research into the sub-variety. These features include the use of ‘y’all’ as a second person plural pronoun, the use of but in a sentence-final position, and ‘lakker’ /'lVk@/ as a pronunciation variant of ‘lekker’ (meaning ‘good’, ‘nice’ or great’). An examination of lexical frequency lists revealed examples of general South African English such as the colloquially pervasive ‘ja’, ‘bladdy’ (for bloody) and jol(ling) (for partying or enjoying oneself) together with neologisms such as ‘eish’, the latter previously associated with speakers of Black South African English. The frequency lists facilitated cross-corpora comparisons with data from the British National Corpus and the Corpus of London Teenage Language and similarities and differences were noted and discussed. The study also used discourse analysis frameworks to investigate the role of high frequency lexical items such as ‘like’ in the data. In recent times ‘like’ has emerged globally as a lexicalized discourse marker, and its appearance in the corpus of Indian South African English confirms this trend. The corpus built as part of this study is intended as the first building block towards a full corpus of Indian South African English which could serve as a standard for referencing research into the sub-variety. Ultimately, it is argued that the establishment of similar corpora of other known sub-varieties of South African English could contribute towards the creation of a truly representative large corpus of South African English and a more nuanced understanding and definition of this important variety of World English.
|
Page generated in 0.1252 seconds