211 |
Acquisition et Expression Multimodale de la Négation. Étude d'un Corpus Vidéo et Longitudinal de Dyades Mère-Enfant Francophone et Anglophone. / Multimodal acquisition and expression of negation. Analysis of a videotaped and longitudinal corpus of a French and an English mother-child dyad.
Beaupoil-Hourdel, Pauline 27 November 2015
Cette thèse porte sur l'acquisition et le développement de la négation chez deux enfants monolingues anglaise et française, filmées entre 10 mois et 4 ans et 2 mois (66h) en interactions naturelles avec leur mère. Nous adoptons une perspective constructiviste et fonctionnaliste de la langue (Tomasello 2003) en tissant des liens avec la théorie des opérations énonciatives, la socialisation langagière et avec les études sur la gestualité. Notre définition du langage est large car nous analysons toutes les ressources sémiotiques dont le locuteur dispose pour se positionner en interaction. À l'aide d'un système de codage multimodal qui repose sur l'utilisation de logiciels compatibles, nous menons des analyses qualitatives et quantitatives de l'usage des modalités verbales et non-verbales pour l'expression de la négation chez l'enfant avant 4 ans. Après avoir présenté l'ancrage théorique (partie 1) et notre méthode (partie 2), nous montrons que la négation correspond à un grand nombre de fonctions pragmatiques qui sont exprimées à l'aide de la synchronisation de modalités distinctes (partie 3). Les résultats indiquent que distinguer le rôle des modalités dans la construction de l’énoncé permet de travailler sur la complexité du langage. Concernant la négation, nous observons qu’il s’agit d’une opération énonciative qui ne repose pas systématiquement sur les mêmes formes selon la fonction exprimée. Cette recherche montre que l'usage synchronisé de plusieurs modalités en contexte de négation est une compétence linguistique et cognitive. En outre, les formes négatives s’enrichissent et se spécialisent après 3 ans pour permettre l’expression d’intentions communicatives variées.
/ This research focuses on the acquisition and development of negation in two monolingual children, one French and one English, filmed from 10 months to 4 years and 2 months old (66 hours) in natural mother-child dyadic interactions. We use a functionalist and constructivist theoretical approach (Tomasello 2003), but we also bring together the French utterer-centred approach to language, language socialisation and gesture studies. Our definition of language encompasses all verbal and non-verbal means of expression speakers use to position themselves within interaction. We developed a multimodal coding system relying on the use of several compatible programs to combine qualitative and quantitative analyses. This method offers the opportunity to investigate the expression of negation in verbal and non-verbal modalities in children under 4. After laying out the theoretical background (Part 1), we present our methodology (Part 2). Results show that negation refers to a vast range of pragmatic functions whose expression is fully embodied because it is conveyed through the synchronisation of several modalities of expression (Part 3). Analysing the interplay of modalities in the construction of meaning proves a fruitful way to account for the complexity of language. We also observe that negation is a meta-category which can be expressed by a variety of forms. Our research shows that the use of synchronised modalities in negative contexts can be considered a linguistic and cognitive skill. Moreover, the set of forms for negation develops and specialises after 3 years and helps the child express various communicative intentions linked to negation.
|
212 |
La communication des émotions chez l’enfant (colère, joie, tristesse) ; études de cas et confrontation de théories linguistiques / The communication of emotions in children and adults
Khaled, Fazia 03 December 2016
Cette thèse propose une analyse multimodale de l’expression des émotions chez deux enfants américaines et leurs parents monolingues. Les enfants ont été filmées entre 11 mois et 3 ans et 10 mois pour l’une et entre 1 an et 1 mois et 4 ans pour l’autre au cours d’interactions spontanées en milieu familial. Nous adoptons une définition du langage large car toutes les ressources sémiotiques sont à prendre en compte : ressources verbales (lexique, marqueurs grammaticaux), vocales (vocalisations), gestuelles et corporelles (gestes, expressions faciales, actions). Nous nous concentrons sur l’acquisition et le développement des marqueurs verbaux et non verbaux exprimant les émotions chez l’enfant et sur l’usage de ces marqueurs chez l’adulte. Nous montrons que des profils expressifs bien précis et distincts semblent déjà émerger chez les enfants, grandement influencés par l’input auquel ils sont exposés chaque jour. Au plan théorique, notre recherche s’inscrit dans une approche constructiviste et fonctionnaliste de la langue (Tomasello, 2003) et nous analysons les données à l’aune de la socialisation langagière, et des études sur la gestualité et les expressions faciales comme vecteurs d’informations communicationnelles. Au plan méthodologique, nous réalisons des analyses quantitatives et qualitatives afin d’éclairer les comportements propres à chaque locuteur. Après avoir exposé notre socle théorique et notre méthodologie (partie I), nous révélons nos résultats sur l’expression de trois émotions (colère, joie, et tristesse) chez les locuteurs adultes et enfants (partie II). Nos résultats suggèrent que le développement linguistique des enfants n’a pas d’incidence sur l’expression de leurs émotions, mais que l’input et les attitudes parentales jouent un rôle majeur dans l’acquisition et le développement de chaque modalité et dans la transmission de modèles expressifs.
/ This research provides a multimodal analysis of the expression of emotion in two monolingual American children and their parents. The children were filmed in natural interactions in a family setting from the ages of 11 months to 3 years 10 months, and from 1 year 1 month to 4 years. We adopted a broad definition of language in this research which encompasses various semiotic resources – from verbal resources (lexicon and grammatical features), to nonverbal (vocalizations, facial expressions, and gestures). We focus on the children’s acquisition and development of these verbal and nonverbal markers and on how they are used by their parents. Our research shows that children develop specific and distinct communicational patterns, which are greatly influenced by the input to which they are exposed. From a theoretical perspective, our research draws on a constructivist and functionalist approach (Tomasello, 2003), and our data is analyzed in light of language socialization and of studies which have shown that facial expressions and gestures are used as communicational signals in face-to-face dialogue. Our methodology combines quantitative and qualitative methods to investigate each speaker’s verbal and nonverbal behavior when expressing emotions. Having outlined our theoretical and methodological foundation (Part I), we present our results on the expression of three emotions (happiness, sadness, and anger) in children and adults (Part II). Our research suggests that, while children’s linguistic development has little impact on the richness of their emotional expression, parental input and attitudes both play a crucial role in the acquisition of each modality and in the transmission of communicational patterns.
|
213 |
比較HAPPEN與其同義字: 以母語及學習者語料庫為基礎的非賓格存現動詞之研究 / Comparing unaccusative HAPPEN and its synonyms: a study of existence/appearance verbs based on native speaker and learner corpora
王亮鈞, Wang, Liang Chun Unknown Date
本研究,基於分辨非賓存現動詞及瞭解二語學習者如何讓習得此類動詞之需求,旨在分析一個高頻率之非賓格存現動詞 HAPPEN與其三個同義字(OCCUR,APPEAR,與EXIST)和中文同義字「發生」從語言使用者角度作比較。採用了母語語料庫 (英文採用英國國家語料庫 BNC;中文採用十億詞語料庫 GW 2.0)及學習者語料庫(含語言訓練與測驗中心學習者語料庫the LTTC,國際英語學習者語料庫the ICLE,及政治大學外語學習者語料庫the NCCU)作為第一部分的語料庫分析。此外,為了探索二語英文錯誤及母語中文遷移的關係,我們也進行了以語料庫為基礎的心理語言學實驗(兩個關於中英文HAPPEN句子結構的接受度判斷測驗)。
本研究結果發現,其一,就語料庫中的文法形式(Grammatical form)來分析HAPPEN、OCCUR、APPEAR與EXIST,英文母語語料庫中的高頻文法形式(例如:happened或happen)與學習者語料庫中有相同的現象。然而大部份的高頻文法形式都是二語學習者經常誤用之處,且容易與兩個常見非賓動詞錯誤—過度被動化錯誤(Overpassivization)和及物化錯誤(Transitivization)—共現(Collocated)。其二,從語料庫錯誤分析各種錯誤類型得知, HAPPEN與OCCUR較常出現過度被動化錯誤;APPEAR與 EXIST較常有及物化錯誤。此結果顯示每個非賓存現動詞可能會犯不同錯誤,也因此造成其錯誤的原因有所不同。其三,從分析心理語言實驗結果得知,我們發現母語中文文法句型(L1 Chinese grammatical patterns),例如:「V-了」-「出現了」;抑或是「V+N」-「發生車禍」、「發生戰爭」、「存在缺失」,都影響了二語學習者對英文非賓動詞之文法形式的正確判定。由此揭示了母語中文大多都對二語英文非賓動詞習得有所干擾。
基於所得結果,我們提出「完成體」(Perfectivity)及「及物性」(Transitivity)之不同來探討中英文間存現動詞用法之異同,並試著解釋造成二語非賓動詞學習複雜化的原因。
此研究克服了過去文獻中比較非賓存現動詞之困難也透過語料庫結合心理實驗研究法提供對非賓動詞習得之解釋方法。這些發現可進一步作為詮釋非賓動詞的假說,並將其應用於語言教材設計或被視為未來跨語言分析研究之基石。 / Owing to the need to identify unaccusative existence/appearance verbs and to understand how they are acquired by L2 learners, the present thesis aims to analyze a highly frequent English unaccusative verb, HAPPEN, and compare it with its three synonyms (OCCUR, APPEAR, and EXIST), as well as its Chinese counterpart 發生 fāshēng ‘happen.’ Native speaker corpora (the British National Corpus (BNC) for English and the Chinese Gigaword 2 Corpus (GW 2.0) for Chinese), and L2 learner corpora (the Language Training and Testing Learner Corpus (the LTTC), International Corpus of Learner English 2.0 (the ICLE), and the National Chengchi University Foreign Language Learner Corpus (the NCCU)) are used to analyze the unaccusative verbs in the first main section. In addition, in order to discover the relationship between L2 English errors and L1 Chinese transfer, psycholinguistic experiments (two acceptability judgment tasks with comparable Chinese and English HAPPEN sentence constructions) based on the corpus data were conducted in this thesis.
The results of this thesis showed that, first, the highly frequent grammatical forms of unaccusative verbs (e.g., happened or happen) in the English native speaker corpus share some similarities with those of the L2 learner corpora. However, these grammatical forms were usually misused by L2 learners and frequently co-occurred with the two common unaccusative errors (overpassivization, e.g., *What is happened? and transitivization, e.g., *I happen a car accident.). Second, as for the distribution of unaccusative error types, HAPPEN and OCCUR were found to co-occur mainly with overpassivization errors, whereas APPEAR and EXIST were found to co-occur mainly with transitivization errors. This indicates that each unaccusative verb may have a different potential for L2 unaccusative errors, and therefore the causes of these errors may vary from verb to verb. Third, from the analysis of the psycholinguistic experiments, we found that L1 Chinese grammatical patterns, such as the V-le pattern (e.g., 出現了 chūxiànle ‘appear-le’) and the V+N pattern (e.g., 發生車禍 fāshēng chēhuò ‘The car accident happened’, 發生戰爭 fāshēng zhànzhēng ‘The war occurred’, and 存在缺失 cúnzài quēshī ‘The pitfalls existed’), may influence L2 learners’ judgment of the grammatical forms of unaccusative verbs. This suggests that L1 Chinese generally interferes to some extent with L2 unaccusative acquisition.
Based on the results, we proposed that the perfectivity and transitivity differences between English and Chinese unaccusative existence/appearance verbs could distinguish the uses of English HAPPEN and Chinese 發生 fāshēng ‘happen’ and their synonyms. These differences could also provide a possible explanation for the difficulty of L2 unaccusative acquisition.
This thesis overcomes the difficulties of comparing unaccusative existence/appearance verbs encountered in previous studies and attempts to unravel the enigma of acquiring this verb type by integrating corpus-based and experimental findings. These findings in turn serve as suggested assumptions for interpreting unaccusative verbs, which can be applied to the design of language teaching materials or viewed as a basis for cross-language analysis in future studies.
|
214 |
Developing Multimodal Spoken Dialogue Systems : Empirical Studies of Spoken Human–Computer Interaction
Gustafson, Joakim January 2002
This thesis presents work done during the last ten years on developing five multimodal spoken dialogue systems, and the empirical user studies that have been conducted with them. The dialogue systems have been multimodal, giving information both verbally with animated talking characters and graphically on maps and in text tables. To be able to study a wider range of user behaviour, each new system has been in a new domain and with a new set of interactional abilities. The five systems presented in this thesis are: the Waxholm system, where users could ask about the boat traffic in the Stockholm archipelago; the Gulan system, where people could retrieve information from the Yellow pages of Stockholm; the August system, which was a publicly available system where people could get information about the author Strindberg, KTH and Stockholm; the AdApt system, which allowed users to browse apartments for sale in Stockholm; and the Pixie system, where users could help an animated agent to fix things in a visionary apartment publicly available at the Telecom museum in Stockholm. Some of the dialogue systems have been used in controlled experiments in laboratory environments, while others have been placed in public environments where members of the general public have interacted with them. All spoken human-computer interactions have been transcribed and analyzed to increase our understanding of how people interact verbally with computers, and to obtain knowledge on how spoken dialogue systems can utilize the regularities found in these interactions. This thesis summarizes the experiences from building these five dialogue systems and presents some of the findings from the analyses of the collected dialogue corpora. / QC 20100611
|
215 |
Influssi e riflessi della lingue indiane sul british english: analisi dei prestiti e della produttività lessicale in prospettiva diacronica e sincronica / Influxes and Reflexes from East Indian Languages on British English: Analysis of the Borrowings and of Lexical Productivity in both Diachronic and Synchronic Perspective
GORLA, CHIARA 07 April 2008
La tesi si concentra sugli influssi lessicali che le lingue indiane hanno esercitato sulla lingua inglese sia in prospettiva diacronica sia sincronica. La prima parte dell'elaborato indaga, tramite l'impiego di uno strumento lessicografico, l'Oxford English Dictionary edizione on-line, la presenza in inglese di prestiti veri e propri, ma anche di derivati e composti, sorti in seguito al contatto tra l'inglese e le lingue indiane a partire dal Sedicesimo secolo e fino ai nostri giorni, arrivando a individuare 1791 forme lessicali. La seconda parte intende verificare l'effettiva presenza, la frequenza d'uso e il significato di tali prestiti, composti e derivati nel British English contemporaneo, avvalendosi degli strumenti offerti dalla linguistica dei corpora. Il corpus di riferimento impiegato in questa seconda fase della ricerca è Bank of English. L'elaborato, oltre a delineare lo scenario storico culturale di riferimento, mette in evidenza le procedure metodologiche impiegate, e ricostruisce l'impianto teorico sulle questioni di interferenze tra codici linguistici, lingue in contatto e prestiti lessicali, riferendosi ai maggiori e più recenti studi in materia. / The research focuses on lexical influences exerted by Indian languages on British English as a result of linguistic contacts between Great Britain and India. Both diachronic and synchronic perspectives are taken into consideration in evaluating the extent of such lexical influences. The first part of the research analyses the presence of words of East Indian origin in English by means of the Oxford English Dictionary, on-line edition, whether these are lexical borrowings proper or derivatives and compounds that arose as a consequence of such linguistic contacts. The historical period taken into consideration runs from the 16th century to the present day.
The second part of the research aims to verify the actual presence, frequency of usage and meaning of such words in contemporary British English by means of a corpus linguistics resource, namely the Bank of English by HarperCollins. The historical and cultural background of the relationships between Great Britain and India, as well as the theoretical background on linguistic interference as a whole, are also illustrated, with reference to the most authoritative and recent studies.
|
216 |
L’extraction de phrases en relation de traduction dans Wikipédia
Rebout, Lise 06 1900
Afin d'enrichir les données de corpus bilingues parallèles, il peut être judicieux de travailler avec des corpus dits comparables. En effet dans ce type de corpus, même si les documents dans la langue cible ne sont pas l'exacte traduction de ceux dans la langue source, on peut y retrouver des mots ou des phrases en relation de traduction.
L'encyclopédie libre Wikipédia constitue un corpus comparable multilingue de plusieurs millions de documents. Notre travail consiste à trouver une méthode générale et endogène permettant d'extraire un maximum de phrases parallèles. Nous travaillons avec le couple de langues français-anglais mais notre méthode, qui n'utilise aucune ressource bilingue extérieure, peut s'appliquer à tout autre couple de langues.
Elle se décompose en deux étapes. La première consiste à détecter les paires d’articles qui ont le plus de chance de contenir des traductions. Nous utilisons pour cela un réseau de neurones entraîné sur un petit ensemble de données constitué d'articles alignés au niveau des phrases. La deuxième étape effectue la sélection des paires de phrases grâce à un autre réseau de neurones dont les sorties sont alors réinterprétées par un algorithme d'optimisation combinatoire et une heuristique d'extension.
L'ajout des quelque 560 000 paires de phrases extraites de Wikipédia au corpus d'entraînement d'un système de traduction automatique statistique de référence permet d'améliorer la qualité des traductions produites.
Nous mettons les données alignées et le corpus extrait à la disposition de la communauté scientifique. / Working with comparable corpora can be useful to enhance bilingual parallel corpora. In fact, in such corpora, even if the documents in the target language are not the exact translation of those in the source language, one can still find translated words or sentences.
The free encyclopedia Wikipedia is a multilingual comparable corpus of several million documents. Our task is to find a general endogenous method for extracting as many parallel sentences as possible from this source. We work with the English-French language pair, but our method, which uses no external bilingual resources, can be applied to any other language pair.
It can best be described in two steps. The first one consists of detecting article pairs that are most likely to contain translations. This is achieved through a neural network trained on a small data set composed of sentence aligned articles. The second step is to perform the selection of sentence pairs through another neural network whose outputs are then re-interpreted by a combinatorial optimization algorithm and an extension heuristic.
The addition of the 560,000 pairs of sentences extracted from Wikipedia to the training set of a baseline statistical machine translation system improves the quality of the resulting translations.
We make both the aligned data and the extracted corpus available to the scientific community.
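The sentence-selection step described above can be illustrated with a small sketch. It assumes a matrix of classifier scores is already available (scores[i][j] is a hypothetical score that source sentence i translates target sentence j) and swaps in a simple greedy one-to-one matching for the combinatorial optimization algorithm, which the abstract does not specify. The function name and the threshold value are likewise illustrative assumptions, not the thesis's own.

```python
def select_pairs(scores, threshold=0.5):
    """Greedy one-to-one matching over a score matrix.

    scores[i][j] is a hypothetical classifier score for source
    sentence i and target sentence j. This greedy rule is a stand-in,
    not the optimization algorithm used in the thesis.
    """
    # Keep only candidates at or above the threshold, best score first.
    candidates = sorted(
        (
            (score, i, j)
            for i, row in enumerate(scores)
            for j, score in enumerate(row)
            if score >= threshold
        ),
        reverse=True,
    )
    used_src, used_tgt, pairs = set(), set(), []
    for _score, i, j in candidates:
        # Each sentence may take part in at most one pair.
        if i not in used_src and j not in used_tgt:
            pairs.append((i, j))
            used_src.add(i)
            used_tgt.add(j)
    return sorted(pairs)


# Toy scores for three source and three target sentences.
scores = [
    [0.9, 0.2, 0.1],
    [0.3, 0.4, 0.8],
    [0.6, 0.1, 0.7],
]
print(select_pairs(scores))
```

On these toy scores the third source sentence stays unpaired because both of its above-threshold candidates are already taken by stronger matches. An exact assignment method could replace the greedy loop without changing the interface.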
|
217 |
Σύνθεση με δεσμευμένο θέμα στην Αγγλική και τη νέα Ελληνική : θεωρητική ανάλυση και υπολογιστική επεξεργασία
Πετροπούλου, Ευανθία 05 March 2012
Η παρούσα διατριβή ασχολείται με τη συγκριτική μελέτη λέξεων στην Αγγλική
και τη Νέα Ελληνική που περιέχουν δεσμευμένα μορφολογικά στοιχεία, δηλαδή
μορφήματα που δεν απαντώνται ανεξάρτητα στο λόγο. Στην Αγγλική οι λέξεις αυτές
είναι γνωστές με τον όρο «νεοκλασικά σύνθετα», καθώς τα δεσμευμένα στοιχεία που
περιέχουν έχουν αρχαιοελληνική ή λατινική προέλευση. Στη Νέα Ελληνική πρόκειται
για μία κατηγορία ρηματικών συνθέτων, που περιέχουν ένα δεσμευμένο θέμα
ρηματικής προέλευσης σε τελική θέση. Σημαντικό χαρακτηριστικό των εν λόγω
λέξεων είναι ότι ένα μεγάλο ποσοστό αυτών ανήκει σε ειδικά επιστημονικά και
τεχνικά λεξιλόγια. Στόχος της μελέτης είναι ο άμεσος συσχετισμός των λέξεων
αυτών στις δύο γλώσσες από μορφολογική άποψη, με απώτερο σκοπό τη βέλτιστη
υπολογιστική τους επεξεργασία.
Για τις ανάγκες της παρούσας συγκροτήθηκαν δύο σώματα δεσμευμένων
θεμάτων που εμφανίζονται σε τελική θέση μέσα σε λέξεις της Αγγλικής και της Νέας
Ελληνικής, τα οποία αποτέλεσαν τα κύρια γλωσσικά δεδομένα, τόσο για την
θεωρητική ανάλυση, όσο και για την υπολογιστική επεξεργασία των υπό εξέταση
λέξεων. Λαμβάνοντας υπόψη τις θεωρητικές απόψεις που έχουν διατυπωθεί, τα
δεδομένα που προέκυψαν από την εξέταση των σωμάτων δεσμευμένων στοιχείων,
καθώς και την παρουσία των λέξεων με δεσμευμένα στοιχεία κατά τη διάρκεια
εξέλιξης της κάθε γλώσσας, πραγματοποιείται η μορφολογική τους ανάλυση, που
έχει ως αποτέλεσμα τον άμεσο συσχετισμό τους με την αναγνώριση αντίστοιχων
δομών στις εν λόγω λέξεις.
Με βάση τα συμπεράσματα της θεωρητικής ανάλυσης επιχειρείται η δημιουργία
ενός συστήματος υπολογιστικής μορφολογικής επεξεργασίας των εν λόγω λέξεων,
που διέπεται από κοινές αρχές και κοινούς κανόνες για τις δύο γλώσσες, με στόχο
την όσο το δυνατόν αποτελεσματικότερη και οικονομικότερη περιγραφή του
φαινομένου, για χρήση σε ένα πλήθος εφαρμογών της γλωσσικής επεξεργασίας. Ο
φορμαλισμός που χρησιμοποιείται αποτελεί έναν λεξιλογικό μεταγλωττιστή (LEXC)
που είναι ιδιαίτερα κατάλληλος για τον ορισμό λεξιλογίων φυσικών γλωσσών και
βασίζεται στις μεθόδους πεπερασμένων καταστάσεων. / This PhD thesis deals with the comparative study of words containing bound morphological stems in English and Modern Greek (MG) and their computational processing. A great number of these words, also known as neoclassical compounds, belong to technical and scientific terminologies. The linguistic data used for the theoretical analysis and the computational processing proposed in this study consist of two corpora, built specially for the study and appended to it, containing bound stems as final elements in words of English and MG. According to the theoretical analysis proposed, compounds with bound stems in the two languages share similar structures. The proposed system for the computational processing of these words, based on the conclusions of the theoretical analysis, makes use of finite state methods, specifically the Xerox Lexical Compiler (LEXC), and offers an efficient way of implementing the phenomenon of neoclassical compounding in modern languages.
|
218 |
A corpus driven computational intelligence framework for deception detection in financial text
Minhas, Saliha Z. January 2016
Financial fraud rampages onwards seemingly uncontained. The annual cost of fraud in the UK is estimated to be as high as £193bn a year [1]. Taking a data science perspective that has hitherto been less explored, this thesis demonstrates how the use of linguistic features to drive data mining algorithms can aid in unravelling fraud. To this end, the spotlight is turned on Financial Statement Fraud (FSF), known to be the costliest type of fraud [2]. A new corpus of 6.3 million words is composed of 102 annual reports/10-K (narrative sections) from firms formally indicted for FSF juxtaposed with 306 non-fraud firms of similar size and industrial grouping. Unlike similar studies, this thesis takes a wide-angled view and extracts a range of features of different categories from the corpus. These linguistic correlates of deception are uncovered using a variety of techniques and tools. Corpus linguistics methodology is applied to extract keywords and to examine linguistic structure. N-grams are extracted to draw out collocations. Readability measurement in financial text is advanced through the extraction of new indices that probe the text at a deeper level. Cognitive and perceptual processes are also picked out. Tone, intention and liquidity are gauged using customised word lists. Linguistic ratios are derived from grammatical constructs and word categories. An attempt is also made to determine ‘what’ was said as opposed to ‘how’. Further, a new module is developed to condense synonyms into concepts. Lastly, frequency counts from keywords unearthed from a previous content analysis study on financial narrative are also used. These features are then used to drive machine learning based classification and clustering algorithms to determine if they aid in discriminating a fraud from a non-fraud firm. The results derived from the battery of models built typically exceed a classification accuracy of 70%. The above process is amalgamated into a framework.
The process outlined, driven by empirical data, demonstrates in a practical way how linguistic analysis could aid in fraud detection and also constitutes a unique contribution to deception detection studies.
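Two of the feature types named above, n-gram counts and word-list ratios, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the tokenizer, the function names and the toy word list are hypothetical and are not taken from the thesis.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (a simplifying assumption)."""
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(tokens, n=2):
    """Count contiguous n-grams; frequent ones are collocation candidates."""
    return Counter(zip(*(tokens[i:] for i in range(n))))


def word_list_ratio(tokens, word_list):
    """Share of tokens found in a customised word list (for example, a tone list)."""
    if not tokens:
        return 0.0
    return sum(token in word_list for token in tokens) / len(tokens)


# Toy narrative and a hypothetical 'negative tone' list.
tokens = tokenize("Net income rose. Net income fell after the loss.")
negative_tone = {"fell", "loss"}
print(ngram_counts(tokens).most_common(1))
print(word_list_ratio(tokens, negative_tone))
```

Per-document values of this kind could then be assembled into feature vectors for a classifier, which is the role the abstract assigns to such features.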
|
219 |
Textual representations of migrants and the process of migration in selected South African media : a combined critical discourse analysis and corpus linguistics study
Crymble, Leigh January 2011
South Africa has long been associated with racial and ethnic issues surrounding prejudice and discrimination and despite a move post-1994 to a democratic ‘rainbow nation’ society, the country has remained plagued by unequal power relations. One such instance of inequality relates to the marginalisation of migrants which has been realised through xenophobic attitudes and actions, most notably the violence that swept across the country in 2008. Several reasons have been suggested in an attempt to explain the cause of the violence, including claims that migrants are taking ‘our jobs and our women’, migrants are ‘illegal and criminal’ and bringing ‘disease and contamination’ with them from their countries of origin. Although widely accepted that many, if not all, of these beliefs are based on ignorance and hearsay, these extensive generalisations shape and reinforce prejudiced ideologies about migrant communities. It is thus only when confronted with evidence that challenges this dominant discourse, that South Africans are able to reconsider their views. Williams (2008) suggests that for many South Africans, Africa continues to be the ‘dark continent’ that is seen as an ominous, threatening force of which they have very little knowledge. For this reason, anti-immigrant sentiment in a South African context has traditionally been directed at African foreigners. In this study I examine the ways in which African migrants and migrant communities, as well as the overall processes of migration, are depicted by selected South African print media: City Press, Mail & Guardian and Sunday Times. Using a combined Corpus Linguistics and Critical Discourse Analysis approach, I investigate the following questions: How are migrants and the process of migration into South Africa represented by these established newspapers between 2006 and 2010? Are there any differences or similarities between these representations? 
In particular, what ideologies regarding migrants and migrant communities underlie these representations? My analysis focuses on the landscape of public discourse about migration with an exploration of the rise and fall of the terminologies used to categorise migrants and the social implications of these classifications. Additionally, I analyse the expansive occurrences of negative representations of migrants, particularly through the use of ‘othering’ pronouns ‘us’ versus ‘them’ and through the use of metaphorical language which largely depicts these individuals as en masse natural disasters. I conclude that these discursive elements play a crucial role in contributing to an overall xenophobic rhetoric. Despite subtle differences between the three newspapers which can be accounted for based on their political persuasions and agendas, it is surprising to note how aligned these publications are with regard to their portrayal of migrants. With a few exceptions, this representation positions these individuals as powerless and disenfranchised and maintains the status quo view of migrants as burdens on the South African economy and resources. Overall, the newspaper articles contribute to mainstream dominant discourse on migrants and migration with the underlying ideology that migrants are responsible for the hardships suffered by South African citizens. Thus, this study contributes significantly to existing bodies of research detailing discourse on migrants and emphasises the intrinsic links between language, ideology and society.
|
220 |
Lexical levels and formulaic language : an exploration of undergraduate students' vocabulary and written production of delexical multiword units
Scheepers, Ruth Angela 11 1900
This study investigates undergraduate students’ vocabulary size, and their use of formulaic language. Using the Vocabulary Levels Test (Laufer and Nation 1995), it measures the vocabulary size of native and non-native speakers of English and explores relationships between this and course of study, gender, age and home language, and their academic performance. A corpus linguistic approach is then applied to compare student writers’ uses of three high-frequency verbs (have, make and take) relative to expert writers. Multiword units (MWUs) featuring these verbs are identified and analysed, focusing on delexical MWUs as one very specific aspect of depth of vocabulary knowledge. Student and expert use of these MWUs is compared. Grammatically and semantically deviant MWUs are also analysed. Finally, relationships between the size and depth of students’ vocabulary knowledge, and between the latter and academic performance, are explored.
Findings reveal that Literature students had larger vocabularies than Law students, females knew more words than males, and older students knew more than younger ones. Importantly, results indicated a relationship between vocabulary size and academic performance. Literature students produced more correct MWUs and fewer errors than Law students. Correlations suggest that the smaller students’ vocabulary, the poorer the depth of their vocabulary is likely to be. Although no robust relationship between vocabulary depth and academic performance emerged, there was evidence of an indirect link between academic performance and correct use of MWUs.
In bringing together traditional methods of measuring vocabulary size with an investigation of depth of vocabulary knowledge using corpus analysis methods, this study provides further evidence of the importance of vocabulary knowledge to academic performance. It contributes to debates on the value of a sound knowledge of high-frequency vocabulary and a developing knowledge of at least 5000 words to academic performance, and the analysis and quantification of errors in MWUs adds to our understanding of novice writers’ difficulties with these combinations. The study also explores new ways of investigating relationships between size and depth of vocabulary knowledge, and between depth of vocabulary knowledge and academic performance. / Linguistics and Modern Languages / D. Litt. et Phil. (Linguistics)
|