Global ETD Search

1	The dynamics of collocation: a corpus-based study of the phraseology and pragmatics of the introductory-it construction Mak, King Tong 28 August 2008 Not available / text Collocation (Linguistics) Computational linguistics
2	A computerized content analysis of Oprah Winfrey's discourse during the James Frey controversy Stephens, Maegan R. January 2008 This analysis utilizes the computer-based content analysis program DICTION to gain a better understanding of Oprah Winfrey's specific discourse types (praise, blame, and standard) and her language surrounding the James Frey Controversy. Grounded in Social Influence Theory, this thesis argues that it is important to understand the language styles of such a significant rhetor in society because she has the potential to influence the public. The findings indicate that Oprah's discourse types differ in the level of Optimism her language represents and that the two episodes of The Oprah Winfrey Show relating to the James Frey Controversy differ in terms of Certainty. Also, this thesis provides a new application of the program DICTION, and the implications for such procedures are discussed. / Department of Communication Studies Discourse analysis -- Data processing. Winfrey, Oprah -- Language.
3	You talking to me? : zero auxiliary constructions in British English Caines, Andrew Paul January 2011 No description available. 400
4	Infusing Automatic Question Generation with Natural Language Understanding Mazidi, Karen 12 1900 Automatically generating questions from text for educational purposes is an active research area in natural language processing. The automatic question generation system accompanying this dissertation is MARGE, which is a recursive acronym for: MARGE automatically reads, generates, and evaluates. MARGE generates questions from both individual sentences and the passage as a whole, and is the first question generation system to successfully generate meaningful questions from textual units larger than a sentence. Prior work in automatic question generation from text treats a sentence as a string of constituents to be rearranged into as many questions as allowed by English grammar rules. Consequently, such systems overgenerate and create mainly trivial questions. Further, none of these systems to date has been able to automatically determine which questions are meaningful and which are trivial. This is because the research focus has been placed on NLG at the expense of NLU. In contrast, the work presented here infuses the question generation process with natural language understanding. From the input text, MARGE creates a meaning analysis representation for each sentence in a passage via the DeconStructure algorithm presented in this work. Questions are generated from sentence meaning analysis representations using templates. The generated questions are automatically evaluated for question quality and importance via a ranking algorithm. Automatic question generation Computational linguistics. Discourse analysis -- Data processing.
5	Text Mining and Topic Modeling for Social and Medical Decision Support Unknown Date Effective decision support plays a vital role in people's daily lives, as well as for professional practitioners such as health care providers. Without correct information and timely derived knowledge, a decision is often suboptimal and may result in significant financial loss or compromised performance. In this dissertation, we study text mining and topic modeling and propose to use text mining methods, in combination with topic models, to discover knowledge from texts widely available from a variety of sources, such as research publications, news, and medical diagnosis notes, and further employ the discovered knowledge to assist social and medical decision support. Examples of such decisions include hospital patient readmission prediction, which is a national initiative for health care cost reduction, academic research topic discovery and trend modeling, and social preference modeling for friend recommendation in social networks. To carry out text mining, our research, in Chapter 3, first focuses on single-document analysis to investigate textual stylometric features for user profiling and recognition. Our research confirms that, by using properly designed features, it is possible to identify the author of an article, using a number of sample articles written by that author as the training data. This study serves as the basis for asserting that text mining is a powerful tool for capturing knowledge in texts for better decision making. In Chapter 4, we advance our research from single documents to documents with interdependency relationships, and propose to model and predict citation relationships between documents. Given a collection of documents with known linkage relationships, our research discovers effective features to train prediction models and predicts the likelihood of two documents being linked by a citation relationship. 
This study will help accurately model social network linkage relationships, and can be used to assist effective decision making for friend recommendation in social networking and reference recommendation in scientific writing. In Chapter 5, we advance a topic discovery and trend prediction principle to discover meaningful topics from a data collection, and further model the evolution trend of each topic. By proposing techniques to discover topics from text, and using temporal correlation between trends for prediction, our techniques can be used to summarize a large collection of documents as meaningful topics, and further forecast the popularity of a topic in the near future. This study can help design systems to discover popular topics in social media, and further assist resource planning and scheduling based on the discovered topics and their evolution trends. In Chapter 6, we apply both text mining and topic modeling to the medical domain for effective decision making. The goal is to discover knowledge from medical notes to predict the risk of a patient being re-admitted in the near future. Our research emphasizes the challenge that re-admitted patients are only a small portion of the patient population, although they bring significant financial loss. As a result, the datasets are highly imbalanced, which often results in poor accuracy for decision making. Our research proposes to use latent topic modeling to carry out localized sampling, and to combine models trained from multiple copies of sampled data for accurate prediction. This study can be directly used to assist hospital re-admission assessment for early warning and decision support. The text mining and topic modeling techniques investigated in the dissertation can be applied to many other domains involving texts and social relationships, towards pattern- and knowledge-based effective decision making. / Includes bibliography. / Dissertation (Ph.D.)--Florida Atlantic University, 2016. 
/ FAU Electronic Theses and Dissertations Collection Social sciences--Research--Methodology. Data mining. Machine learning. Database searching. Discourse analysis--Data processing. Communication--Network analysis. Medical care--Quality control.
6	Towards a corpus of Indian South African English (ISAE) : an investigation of lexical and syntactic features in a spoken corpus of contemporary ISAE Pienaar, Cheryl Leelavathie January 2008 There is consensus among scholars that there is not just one English language but a family of “World Englishes”. The umbrella-term “World Englishes” provides a conceptual framework to accommodate the different varieties of English that have evolved as a result of the linguistic cross-fertilization attendant upon colonization, migration, trade and transplantation of the original “strain” or variety. Various theoretical models have emerged in an attempt to understand and classify the extant and emerging varieties of this global language. The hierarchically based model of English, which classifies world English as “First Language”, “Second Language” and “Foreign Language”, has been challenged by more equitably-conceived models which refer to the emerging varieties as New Englishes. The situation in a country such as multi-lingual South Africa is a complex one: there are 11 official languages, one of which is English. However, the English used in South Africa (or “South African English”) is not a homogeneous variety, since its speakers include those for whom it is a first language, those for whom it is an additional language and those for whom it is a replacement language. The Indian population in South Africa are amongst the latter group, as theirs is a case where English has ousted the traditional Indian languages and become a de facto first language, which has retained strong community resonances. This study was undertaken using the methodology of corpus linguistics to initiate the creation of a repository of linguistic evidence (or corpus) of Indian South African English, a sub-variety of South African English (Mesthrie 1992b, 1996, 2002). 
Although small (approximately 60 000 words), and representing a narrow age band of young adults, the resulting corpus of spoken data confirmed the existence of robust features identified in prior research into the sub-variety. These features include the use of ‘y’all’ as a second person plural pronoun, the use of ‘but’ in a sentence-final position, and ‘lakker’ /'lVk@/ as a pronunciation variant of ‘lekker’ (meaning ‘good’, ‘nice’ or ‘great’). An examination of lexical frequency lists revealed examples of general South African English such as the colloquially pervasive ‘ja’, ‘bladdy’ (for ‘bloody’) and ‘jol(ling)’ (for partying or enjoying oneself) together with neologisms such as ‘eish’, the latter previously associated with speakers of Black South African English. The frequency lists facilitated cross-corpora comparisons with data from the British National Corpus and the Corpus of London Teenage Language, and similarities and differences were noted and discussed. The study also used discourse analysis frameworks to investigate the role of high frequency lexical items such as ‘like’ in the data. In recent times ‘like’ has emerged globally as a lexicalized discourse marker, and its appearance in the corpus of Indian South African English confirms this trend. The corpus built as part of this study is intended as the first building block towards a full corpus of Indian South African English which could serve as a standard for referencing research into the sub-variety. Ultimately, it is argued that the establishment of similar corpora of other known sub-varieties of South African English could contribute towards the creation of a truly representative large corpus of South African English and a more nuanced understanding and definition of this important variety of World English. English language -- South Africa Computational linguistics -- Methodology Discourse analysis -- Data processing East Indians -- South Africa -- Language Lexicology
