Global ETD Search

11	'Healthy' Coreference: Applying Coreference Resolution to the Health Education Domain. Hirtle, David Z. January 2008. This thesis investigates coreference and its resolution within the domain of health education. Coreference is the relationship between two linguistic expressions that refer to the same real-world entity, and resolution involves identifying this relationship among sets of referring expressions. The coreference resolution task is considered among the most difficult problems in Artificial Intelligence; in some cases, resolution is impossible even for humans. For example, "she" in the sentence "Lynn called Jennifer while she was on vacation" is genuinely ambiguous: the vacationer could be either Lynn or Jennifer.
There are three primary motivations for this thesis. The first is that health education has never before been studied in this context. So far, the vast majority of coreference research has focused on news. Secondly, achieving domain-independent resolution is unlikely without understanding the extent to which coreference varies across different genres. Finally, coreference pervades language and is an essential part of coherent discourse. Its effective use is a key component of easy-to-understand health education materials, where readability is paramount.
No suitable corpus of health education materials existed, so our first step was to create one. The comprehensive analysis of this corpus, which required manual annotation of coreference, confirmed our hypothesis that the coreference used in health education differs substantially from that in previously studied domains. This analysis was then used to shape the design of a knowledge-lean algorithm for resolving coreference. This algorithm performed surprisingly well on this corpus, e.g., successfully resolving over 85% of all pronouns when evaluated on unseen data.
Despite the importance of coreferentially annotated corpora, only a handful are known to exist, likely because of the difficulty and cost of reliably annotating coreference. The paucity of genres represented in these existing annotated corpora creates an implicit bias in domain-independent coreference resolution. In an effort to address these issues, we plan to make our health education corpus available to the wider research community, hopefully encouraging a broader focus in the future. Keywords: coreference resolution, anaphora, computational linguistics, natural language processing, corpus analysis, health education. Subject: Computer Science
13	Discourse-givenness of noun phrases: theoretical and computational models. Ritz, Julia. January 2013. This thesis gives formal definitions of discourse-givenness, coreference and reference, and reports on experiments with computational models of discourse-givenness of noun phrases for English and German. Definitions are based on Bach's (1987) work on reference, Kibble and van Deemter's (2000) work on coreference, and Kamp and Reyle's Discourse Representation Theory (1993). For the experiments, the following corpora with coreference annotation were used: MUC-7, OntoNotes and ARRAU for English, and TüBa-D/Z for German. The classification algorithms covered are J48 decision trees, the rule-based learner Ripper, and linear support vector machines. New features are suggested that represent the noun phrase's specificity as well as its context; these lead to a significant improvement in classification quality. Keywords: discourse-givenness, classifier, coreference, context, tf-idf. Subject: Language, Linguistics
14	Apposition en français contemporain: description, position, fonction, fréquence. Comparaison avec le tchèque. / Apposition in Contemporary French: Description, Position, Function, Frequency. Comparison with Czech. DAŇKOVÁ, Klára. January 2015. The first aim of this work is to describe how the term apposition is defined in French and Czech linguistics. The second aim is to examine the use of one type of French apposition in journalistic and legal texts and to find out which equivalents are used for these expressions in Czech. The work is divided into a theoretical and a practical part. The theoretical part describes the different approaches to apposition in French and Czech. The practical part begins by choosing one definition of apposition, which is then used in the corpus analysis. The corpus analysis is conducted using the InterCorp corpus; it examines the function and frequency of French apposition in journalistic and legal texts and analyses its Czech equivalents.
15	Nominální anafora a determinace ve starofrancouzském románu Aucassin et Nicolette / Nominal Anaphora and Determination in the Old French Novel Aucassin et Nicolette. ŽEMLIČKOVÁ, Klára. January 2015. The thesis deals with the relationship between the anaphorization of the nominal syntagma and determination, which can be seen as an exponent of the anaphorization process. The purpose of this work is to evaluate what role the individual determiners played in Old French compared with their role in contemporary French. The thesis is based on the specialized literature and on an analysis of anaphoric nominal syntagmas in the text Aucassin et Nicolette. The work is divided into two main parts. The first, theoretical part draws on the specialized literature to define the basic concepts on which the analysis is based; the second part consists mainly of the analysis itself of nominal syntagmas in the text Aucassin et Nicolette.
16	Koheze a koherence textu komiksových příběhů (Astérix a Obélix). / Cohesion and Coherence of the Comic Books' Text (Astérix and Obélix). MUŽÍČKOVÁ, Nikola. January 2015. This diploma thesis analyses the texts of comic books. Its objective is to analyse the relationships in this kind of text, with a special focus on its illustrational component and on the description of several types of anaphora. Several Astérix comic books, published in French, served as the material for the thesis. The thesis is divided into a theoretical and an analytical part. The theoretical part contains three chapters: the first describes comic books as a distinct art form, the second describes the Astérix comic books, and the third is dedicated to the linguistic terms closely connected to this subject. The analytical part demonstrates all referential relationships between the referential parts of some of the Astérix comic books.
17	Paralelismo e foco estrutural no processamento da correferência de pronomes e de nomes repetidos / Parallelism and Structural Focus in the Coreference Processing of Pronouns and Repeated Names. Lima, Juciane Nóbrega. 25 March 2014. Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES. The present study investigates intrasentential coreference processing, observing how the coreference of pronouns and repeated names is processed in relation to the focus of their antecedents. Our initial hypothesis is that repeated names are more costly to process than pronouns, regardless of whether the antecedent is salient. This is the Repeated Name Penalty, postulated by the Informational Load Hypothesis (Almor 1999, 2000). We performed two online self-paced reading experiments using the Psyscope program. The dependent variable in both experiments was the reading time of the critical segment (repeated name or pronoun), and the independent variables were the type of resumption (pronoun or repeated name) and the position of the antecedent (focused or unfocused). The difference between the two experiments was that in the first, all experimental sentences were controlled so that pronouns and repeated names appeared in the same position and syntactic function as their antecedents, that is, in parallel. The second was controlled so that no condition had antecedent and resumption in parallel. Twenty-eight UFPB students participated in each experiment.
The results of the first experiment show shorter reading times for pronouns than for repeated names, regardless of whether the antecedent was focused. Structural focus showed no significant effect in any of the experimental conditions. A possible explanation is that the effect of structural parallelism overrode the effect of focus, and this is what the results of the second experiment showed: with resumption and antecedent not in parallel, structural focus had a significant effect, and reading was faster when the antecedent was focused than when it was not. The Repeated Name Penalty was also confirmed in this second experiment. Subject: Linguistics, Letters and Arts::Linguistics
18	Information extraction from pharmaceutical literature. Batista-Navarro, Riza Theresa Bautista. January 2014. With the constantly growing amount of biomedical literature, methods for automatically distilling information from unstructured data, collectively known as information extraction, have become indispensable. Whilst most biomedical information extraction efforts in the last decade have focussed on the identification of gene products and interactions between them, the biomedical text mining community has recently extended its scope to capture associations between biomedical and chemical entities with the aim of supporting applications in drug discovery. This thesis is the first comprehensive study focussing on information extraction from pharmaceutical chemistry literature. In this research, we describe our work on (1) recognising names of chemical compounds and drugs, facilitated by the incorporation of domain knowledge; (2) exploring different coreference resolution paradigms in order to recognise co-referring expressions given a full-text article; and (3) defining drug-target interactions as events and distilling them from pharmaceutical chemistry literature using event extraction methods. 005.74
19	A Corpus-Based Comparison Between Coreferential Direct Object Nominal Clauses and Direct Object Infinitive Complements. Rutter, Ethan C. 18 April 2022. The objective of this thesis is to analyze the variation between two competing structures: on the one hand, a transitive verb that takes a finite nominal clause as its complement, and on the other hand, a transitive verb that takes an infinitive as its complement. This thesis seeks to address three questions: (1) Which semantic categories are more likely to use coreferential nominal clauses as complements? (2) How do coreferential finite nominal clauses compare with coreferential infinitive complements in terms of frequency of usage? And (3) do the corpora show any variation among different countries? The corpora CREA and Web/Dialects were used to determine the frequency of usage of these two structures with four different semantic categories of verbs used in the main clause: assertive, dubitative, evaluative, and volitive. The U.S., Spain, Argentina, and Mexico were also used to compare the results by country. The findings show that when the main verb is assertive, the use of a subordinate clause is favored, while main clauses with dubitative verbs showed mixed results between the corpora, although Web/Dialects showed that dubitatives favor an infinitive complement. The evaluative verbs lamentar and odiar did not produce any coreferential results with direct object nominal clauses. Volitive verbs never accepted a coreferential finite clause. The Web/Dialects results indicate that Spain and the U.S. were more likely than Argentina and Mexico to use the finite construction after a main clause with a dubitative phrase, while still favoring the infinitive complement. Keywords: coreference, finite verbs, infinitive complement, corpus linguistics, verb mood, regional variation. Subject: Arts and Humanities
20	MEDICAL EVENT TIMELINE GENERATION FROM CLINICAL NARRATIVES. Raghavan, Preethi. 05 September 2014. No description available. Subject: Computer Science
