  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Web-assisted anaphora resolution

Li, Yifan 06 1900 (has links)
This dissertation investigates the utility of the web for anaphora resolution. Aside from offering a highly accurate, web-based method for pleonastic it detection, which eliminates up to 4% of errors in pronominal anaphora resolution, it also introduces a web-assisted model for definite description anaphoricity determination and a prototype system of anaphora resolution that uses the web for virtually all subtasks. The thesis starts with a thorough analysis of the relationship between anaphora and definiteness, a study that bridges the gap between previously reported empirical studies of definite description anaphora and the linguistic theories developed around the concept of definiteness. Various naturally-occurring definite descriptions found in the WSJ corpus are analyzed from both perspectives of familiarity and uniqueness, and a new classification scheme for definite descriptions is developed. With the fundamental issues solved, the rest of the thesis focuses on the various ways the web can be exploited for the purpose of anaphora resolution. This thesis presents methods of high-precision, high-recall anaphoricity determination for both pronouns and definite descriptions. Evaluation results suggest that the performance of the pleonastic it identification module is on par with casually-trained human annotators. When used together with a pronominal anaphora resolution system, the module offers a statistically significant performance gain of 4%. The performance of the anaphoricity determination module for definite descriptions, which benefits from both the insight gained from the study on anaphora and definiteness and the significantly expanded coverage offered by the web, is also one of the highest among existing studies. The thesis also introduces a web-centric anaphora resolution system. 
Aside from serving as the information source for implementing selectional restrictions and discovering hyponym/synonym relationships, the web is additionally used for gender/number determination and many other auxiliary tasks, such as determining the semantic subjects of as-prepositions, identifying antecedents for certain empty categories, and assigning appropriate labels for proper names using information available from the text itself. With a design that specifically leaves room for the application of verb-argument and genitive co-occurrence statistics, the web-based features provide statistically significant gains to the system's performance. / Software Engineering and Intelligent Systems
2

Web-assisted anaphora resolution

Li, Yifan Unknown Date
No description available.
3

Anaphors in discourse : anaphoric subjects in brazilian portuguese / Les anaphores dans le discours : sujets anaphoriques en portugais brésilien

Correa Soares, Eduardo 15 December 2017 (has links)
The present dissertation is concerned with the use and interpretation of null and pronominal subjects in Brazilian Portuguese. This investigation examines these phenomena in an attempt to disentangle the semantic and discursive factors that can be relevant for the choice between these anaphoric expressions in Brazilian Portuguese and the way in which this choice is articulated with the general theory of anaphora resolution. The starting point of this dissertation was the research looking into null and overt subjects from the perspective of Generative Grammar, especially the Parametric Theory. Throughout the present work, however, the analyses proposed in this perspective were shown not to account for the data at stake. The generalization that poor verbal morphology is directly related to the absence or reduced frequency of null subjects, for example, is challenged through experimental data and an investigation of the relative frequency of null subjects across discourse persons in corpora. An alternative explanation presented in the previous literature, namely the importance of the antecedents’ features of Animacy and Specificity, seems to better account for the attested distribution. However, this explanation is not sufficient for understanding the choice between null and overt subjects in Brazilian Portuguese, since the number of animate and specific null subjects is still relatively higher than in languages with obligatory expression of subjects. Therefore, it is argued that discourse factors seem to play a crucial role in the use of null and overt subjects in Brazilian Portuguese. The main factors identified here are Obviousness and Contrast. 
The first is a standard feature in the literature about anaphora resolution (expressed by a variety of terms, such as Salience, Familiarity, Accessibility, etc.), which is part of the reverse mapping hypothesis according to which the more obvious the subject is, the less explicit the co-referential form is allowed to be. The second factor, Contrast, is the main finding of the present dissertation: as is the case for other levels of linguistic analyses and other phenomena in language, the choice of anaphoric expression in Brazilian Portuguese seems to be driven by efficiency. In the present case, this means that, when the backgrounded information and the asserted (focused) information in an utterance contrast the most, it is more likely that a null subject will be used. The design of a grammar that deals with these multiple features is sketched: specifically, a multi-layered scalar probabilistic grammar is proposed, whose semantic and discourse constraints act in parallel through a probabilistic mapping. It is thus shown that null subjects are likely in discursive co-reference, since in these contexts their antecedents are more obvious and the focused information contrasts the most with the background. An apparent counter-example to the proposal sketched here is analyzed: the generic interpretation of null subjects. However, it is shown that the same semantic constraints cross-linguistically applied to other generic constructions can produce generic null subjects in Brazilian Portuguese, given the failure to be grounded predicted by the approach proposed here. Finally, on-line evidence for the analysis of the use and interpretation of null and pronominal subjects is provided. The results found in three eye-tracking-while-reading experiments provide striking evidence in favor of the proposal put forward here, according to which null and overt subjects and their interpretation can be accounted for in terms of constraints on interpretation rather than licensing.
4

Anaphoric preferences of null and overt subjects in Italian and Spanish : a cross-linguistic comparison

Filiaci, Francesca January 2011 (has links)
This thesis focuses on the cross-linguistic differences between Italian and Spanish regarding the pragmatic restrictions on the resolution of null and overt subject pronouns (NS and OSP). It also tries to identify possible links between such cross-linguistic differences and morpho-syntactic differences at the level of the verbal morphology of the two languages. Spanish and Italian are typologically related and morpho-syntactically similar and have been assumed to instantiate the same setting of the NS parameter with respect to not only its syntactic licensing conditions, but also the pragmatic constraints determining the distribution of null and overt subject pronouns, and this assumption has had important implications for cross-linguistic research. The first aim of this study was to test directly for the first time the assumption about the equivalence of Italian and Spanish; in order to do so, I ran a series of self-paced reading experiments using the same materials translated into each language, so that the results were directly comparable. The experiments were based on Carminati’s (2002) study on antecedent preferences for Italian NSs and OSPs in intra-sentential anaphora, testing the Position of Antecedent Strategy. The results suggest that while in Italian there is a strict division of labour between NS and OSP (confirming Carminati’s findings), this division is not as clear-cut in Spanish. More precisely, while Italian personal pronouns unambiguously signal a switch in subject reference, the association between OSPs and switch reference seems to be much weaker in Spanish. These results, which are interpreted in terms of Cardinaletti and Starke’s (1999) cross-linguistic typology of deficient pronouns, highlight an asymmetry between the strength of NS and OSP biases in Spanish that could not have emerged through the traditional methodology used by the numerous variationist studies on the subject, based on corpus analysis. 
A subsequent pair of experiments tested the hypothesis that the cross-linguistic differences attested might be related to the relative syncretism of the Spanish verbal morphology compared to the Italian one with regard to the unambiguous expression of person features on the verbal head. The results only provided weak support for the hypothesis, although they did confirm the presence of the cross-linguistic differences in the processing and resolution of anaphoric NS and OSP dependencies revealed by the previous experiments.
5

An Experimental Study On Abstract Anaphora Resolution In Turkish Written Discourse

Ergin Somer, Rabiye 01 September 2012 (has links) (PDF)
This thesis provides an experimental approach to abstract anaphora resolution in Turkish written discourse. The core of this work consists of identifying various manifestations of abstract anaphoric expressions – bu vs. bu durum, bu olay, bu is, bu gerçek (bu as the bare abstract object anaphor vs. bu+label abstract anaphors) – in Turkish discourse, and investigating whether any difference is observed in their processing. To this end, two offline experiments are conducted with human subjects, and the results indicate that label anaphors, compared to the bare anaphor bu, have a tendency to disambiguate the antecedent in some cases.
6

An Experimental Study On Abstract Anaphora Resolution In Turkish Written Discourse

Ergin Somer, Rabiye 01 September 2012 (has links) (PDF)
This thesis provides an experimental approach to abstract anaphora resolution in Turkish written discourse. The core of this work consists of identifying various manifestations of abstract anaphoric expressions – bu vs. bu durum, bu olay, bu is, bu gerçek (bu as the bare abstract object anaphor vs. bu+label abstract anaphors) – in Turkish discourse, and investigating whether any difference is observed in their processing. To this end, two offline experiments are conducted with human subjects, and the results indicate that label anaphors, compared to the bare anaphor bu, have a tendency to disambiguate the antecedent in some cases.
7

A Knowledge-poor Pronoun Resolution System For Turkish

Kucuk, Dilek 01 September 2005 (has links) (PDF)
This thesis presents a knowledge-poor pronoun resolution system for Turkish which resolves third person personal pronouns and possessive pronouns. The system is knowledge-poor in the sense that it makes use of limited linguistic and semantic knowledge to resolve the pronouns. As in pronoun resolution proposals for languages like English, French and Spanish, the core of the system is a set of constraints and preferences which are determined empirically. The system has four modules: sentence splitting, pronoun extraction, forming the list of candidate antecedents and determination of the antecedent. It takes a Turkish text as input and rewrites this text with the considered pronouns replaced by their proposed antecedents. In order to assess the success rate of the system, two different baseline algorithms are implemented. The original system is tested against these baseline algorithms on two sample Turkish texts from different sources. Some suggestions to improve the success rate of the system and to extend the domain of the system are also presented.
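The four-module design described in this abstract can be illustrated with a minimal sketch. It is not the thesis's system: the pronoun list, the function-word list, the capitalisation test for candidates and the single recency preference are all hypothetical stand-ins, and the toy text is English rather than Turkish for readability.

```python
import re

# Hypothetical toy lexicons; the thesis's actual constraints and preferences
# are not given in the abstract.
PRONOUNS = {"he", "she", "they"}
FUNCTION_WORDS = {"the", "a", "an"}


def split_sentences(text):
    """Module 1: naive sentence splitting after '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    """Split a sentence into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def is_pronoun(token):
    """Module 2: a token is a pronoun if it is in the toy pronoun list."""
    return token.lower() in PRONOUNS


def is_candidate(token):
    """Module 3: capitalised tokens stand in for candidate antecedents."""
    low = token.lower()
    return token[:1].isupper() and low not in PRONOUNS and low not in FUNCTION_WORDS


def choose_antecedent(candidates):
    """Module 4: one recency preference, the most recent candidate wins."""
    return candidates[-1] if candidates else None


def detokenize(tokens):
    """Join tokens and remove the space before punctuation."""
    return re.sub(r"\s+([^\w\s])", r"\1", " ".join(tokens))


def resolve(text):
    """Rewrite the text with each pronoun replaced by its proposed antecedent."""
    candidates = []
    rewritten = []
    for sentence in split_sentences(text):
        out = []
        for token in tokenize(sentence):
            if is_pronoun(token):
                antecedent = choose_antecedent(candidates)
                out.append(antecedent if antecedent else token)
            else:
                if is_candidate(token):
                    candidates.append(token)
                out.append(token)
        rewritten.append(detokenize(out))
    return " ".join(rewritten)
```

Calling resolve("Ayla bought a book. She read it.") rewrites the second sentence with "Ayla" in place of "She"; a pronoun with no earlier candidate is left unchanged. A real system would replace the recency rule with empirically determined constraints and preferences, as the abstract describes.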
8

Spanish and Greek subjects in contact : Greek as a heritage language in Chile

Giannakou, Aretousa January 2018 (has links)
The present study aims to capture linguistic variation in subject distribution of two typologically similar languages, Greek and Chilean Spanish, considering adult monolingual and bilingual speakers of Greek as a heritage/minority language in Chile. The focus is on null and overt third-person subjects in topic-continuity and topic-shift contexts. Such structures involve the interface between syntax and discourse/pragmatics, a vulnerable domain in bilingualism. Previous research has shown overextension of the scope of the overt subject pronoun in contexts where null subjects are discursively expected (e.g. Tsimpli, Sorace, Heycock & Filiaci 2004). The Interface Hypothesis (IH) (Sorace 2011) was formulated to account for such findings, which obtain even in pairs of two null subject languages (Sorace, Serratrice, Filiaci & Baldo 2009). The key question as to the language-contact effects on subject distribution in pairs of two null subject languages requires further exploration while the combination of Greek and Spanish has been so far understudied. The IH is evaluated with new empirical data from a bilingual situation not studied before. Data from oral narratives and aural pronominal anaphora resolution were elicited from monolinguals and three types of bilinguals, namely first-generation immigrants, heritage speakers and L2 speakers of Greek residing in Chile. The monolingual data revealed differences in the use and interpretation of overt subject pronouns between Greek and Chilean Spanish. The crosslinguistic difference lies in the strong deictic properties of the Greek pronoun compared to its Spanish counterpart; hence differences obtain because of the relative strength of the two pronominal forms. No overextension of the scope of overt pronouns was found in bilinguals, against predictions stemming from the Interface Hypothesis. 
This may relate to the typological similarity between Greek and Spanish as well as to the nature of the Greek pronoun, which makes its use relatively categorical. Such findings lend support to the Representational account (Tsimpli et al. 2004). By contrast, null subjects gave rise to optionality, presumably due to their complexity, which demands higher degrees of computational efficiency. The Vulnerability Hypothesis (Prada Pérez 2018) may also account for the findings.
9

Discourse processing abilities in ageing : influence of working memory capacity on reference resolution.

Ghaleh, Maryam January 2015 (has links)
Maintaining health and quality of life into old age is a critical issue facing society today. Language, and in particular language comprehension, is vulnerable to the processes of ageing (Au, Albert, & Obler, 1989; Kynette & Kemper, 1986; Nicholas, Obler, Albert, & Goodglass, 1985; Shewan & Henderson, 1988). An improved understanding of language processing and ageing will assist in distinguishing language difficulties in normal ageing from those in pathological ageing and aphasia (Maxim & Bryan, 1994) and, potentially, optimise communication throughout life. The current thesis focuses on a specific component of language comprehension: anaphora resolution. Anaphora resolution occurs frequently in everyday discourse and has been reported to decline with ageing (Cohen, 1979; Light & Capps, 1986; Ulatowska, Hayashi, Cannito, & Fleming, 1986). This thesis explored anaphora resolution relative to two key variables: ageing and working memory. Ageing was chosen as a variable as anaphora resolution has been shown to be affected by age (Cohen, 1979; Light & Capps, 1986; Ulatowska et al., 1986). Working memory was chosen as working memory is thought to underlie key aspects of discourse comprehension such as building a mental structure of discourse and updating the information (Brébion, 2003; Hasher & Zacks, 1988; Radvansky, Copeland, & Hippel, 2010; Radvansky, Lynchard, & von Hippel, 2009). Anaphora resolution was investigated using two key paradigms. The first focussed on anaphora resolution in a reading comprehension task. Performance was assessed using accuracy of response. The second employed Gernsbacher's (1989) probe-response paradigm. 
The probe-response paradigm allowed examination of specific working memory processes underlying discourse comprehension, namely: a) storing and maintaining information in working memory (i.e., laying the foundation of the discourse structure); and b) updating information stored in working memory through suppressing the irrelevant discourse information. Storage and maintenance of the information was assessed by examining whether participants utilised “advantage of first mention” (Gernsbacher, 1990). Suppression was evaluated by investigating whether the accessibility of nonreferent names decreased in participants' working memory after they read anaphoric pronouns in sentences. This approach aimed to answer the following questions: 1) Do age and working memory capacity affect anaphora resolution in a comprehension task? 2) Do age and working memory affect advantage of first mention in a probe recognition task? 3) Does age affect suppression of irrelevant information in an anaphora resolution task? In Chapter 3, Gernsbacher's (1989) original study was replicated. In Chapter 4 the same questions were examined, with the addition of a higher working memory load. For both studies, 30 younger and 30 older participants completed two comprehension experiments followed by an assessment of working memory capacity (reading span task). The comprehension experiments each contained a reading comprehension task and a probe recognition task. The reading comprehension task introduced two discourse characters (either a male or female name), one of which was referred to later in the text, using an anaphoric pronoun. Comprehension questions always asked about the referents of the anaphoric pronouns. Participants' accuracy in answering each comprehension question was indicative of their ability to resolve anaphora. Response times in the recognition task provided measures of the accessibility of: a) first and second mentioned names, and b) referent and nonreferent names. 
Chapters 3 and 4 found that, regardless of the tasks' working memory storage demands, older adults were less accurate than younger adults in the comprehension of anaphoric pronouns. Comprehension accuracy was related to working memory capacity, such that individuals with higher working memory capacity exhibited higher accuracy of response in the comprehension task. In addition, working memory capacity affected the accessibility of first and second mentioned names in the discourse, suggesting that working memory capacity might influence the process of laying the foundation for the mental representation of comprehension. An ageing effect was observed on the suppression process during anaphora resolution under high working memory load only. When working memory load was low, neither younger nor older participants suppressed the accessibility of the nonreferents by the time they finished reading the sentences. This suggested that anaphora resolution might be postponed in less demanding tasks. However, under higher working memory load, younger adults, but not older adults, suppressed the accessibility of the nonreferents by the time they finished reading the sentence. It was therefore suggested that age-related changes in anaphora resolution abilities might be mediated by a decline in inhibitory functions that are responsible for suppressing already-activated information that is no longer relevant to the task goals. The final study of the thesis (Chapter 5) aimed to determine why younger adults delayed the process of anaphora resolution in Experiment 1 (see Chapter 3), but completed the process by the time they finished reading the sentences in Experiment 2 (see Chapter 4). 
Specific questions addressed were: 1) Was comprehension accuracy affected by working memory storage load and the syntactic structure of the sentences? 2) Do younger adults suppress the accessibility of the nonreferents by the time they reach the end of the sentence, in simpler sentences with increased storage load and late disambiguation? 3) Do younger adults suppress the accessibility of nonreferents by the time they reach the end of the sentence, in more syntactically complex sentences with low storage load and prior disambiguation? Forty younger participants completed four separate comprehension experimental tasks followed by a reading span test. An experimental approach similar to that described in Chapters 3 and 4 was employed; however, working memory storage load, syntactic complexity, and time-course for providing contextual information were manipulated. Chapter 5 found that participants' accuracy declined in more syntactically complex sentences. A decline in accuracy appeared indicative of the tasks' higher processing demands and demonstrated that prior disambiguation was not facilitating the resolution of anaphora. Results from the recognition task showed that in sentences of increased syntactic complexity, participants suppressed the accessibility of nonreferents by the time they finished reading the sentence. It was suggested that higher processing demands of syntactically complex sentences, rather than a facilitating effect of earlier disambiguation in these sentences, contributed to the earlier suppression of nonreferents. In summary, this thesis demonstrated that older adults were less accurate than younger adults in comprehending anaphoric pronouns. Moreover, working memory capacity positively influenced comprehension accuracy and affected the advantage of first mention of discourse entities. 
It was suggested that individual differences in working memory capacity might affect the ability to lay foundations for discourse comprehension. Furthermore, older adults showed no suppression of nonreferents during processing of anaphora, regardless of working memory storage load. It appears possible that older adults' difficulty in anaphora resolution might be due to an inability to suppress irrelevant discourse information. Findings from the present study suggest that ageing may negatively affect the comprehension of linguistic structures for which more than one meaning could be inferred. While further exploration of this finding is required, it is possible that communication strategies could be devised to minimise the use of structures with more than one meaning - with the aim of improving and maintaining communication in older adults. Ultimately, determining the underlying causes of language impairments in both healthy ageing and neurological disease will help to improve speech-language therapy methods for these populations.
10

Detecção de opiniões e análise de polaridade em documentos financeiros com múltiplas entidades [Opinion detection and polarity analysis in financial documents with multiple entities]

Silva, Josiane Rodrigues da 05 March 2015 (has links)
CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Polarity analysis aims at classifying the author’s opinion as positive, negative, or neutral. However, given the sheer volume of information available on the web, carrying out such a task manually is unfeasible. In the financial domain in particular, this type of analysis helps companies make decisions related to the financial market, which is particularly prone to change as opinions shift. Most studies in the literature deal with this problem by considering that documents have a global polarity. However, in general, documents cite several entities with possibly different polarities. This suggests that the classification should be performed at the entity level. However, most traditional approaches do not focus on classifying polarity per entity. Besides this problem, we also noted that many financial documents do not always express an opinion. Thus, a first task of interest in this research field is to identify documents in which opinions are expressed, that is, the subjective ones. Therefore, in this work we propose a supervised polarity classification method based on multiple models to deal with financial documents with multiple entities. In particular, we study text segmentation strategies that use heuristics such as string matching and anaphora resolution, and we propose a hierarchical classification method based on subjectivity detection. Our results showed that the multiple-models approach significantly outperformed the global-model baseline. The segmentation of the documents restricted to sentences that mention entities and the adoption of a hierarchical strategy also achieved gains, although modest.
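The entity-level segmentation and hierarchical classification outlined in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the cue-word sets stand in for the supervised models, the alias table and all function names are hypothetical, and segmentation uses plain string matching only (no anaphora resolution).

```python
import re

# Hypothetical cue words standing in for trained subjectivity/polarity models.
POSITIVE = {"gain", "gains", "rose", "profit"}
NEGATIVE = {"loss", "losses", "fell", "debt"}


def segment_by_entity(document, aliases):
    """Collect, per entity, the sentences that mention one of its aliases."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", document.strip()) if s]
    segments = {entity: [] for entity in aliases}
    for sentence in sentences:
        low = sentence.lower()
        for entity, names in aliases.items():
            if any(name.lower() in low for name in names):
                segments[entity].append(sentence)
    return segments


def words(text):
    """Lower-cased set of word tokens in the text."""
    return set(re.findall(r"\w+", text.lower()))


def is_subjective(text):
    """First level: does the segment express any opinion at all?"""
    return bool(words(text) & (POSITIVE | NEGATIVE))


def polarity(text):
    """Second level: polarity of a segment already judged subjective."""
    found = words(text)
    score = len(found & POSITIVE) - len(found & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def classify_entities(document, aliases):
    """Label each entity using only the sentences that mention it."""
    labels = {}
    for entity, sentences in segment_by_entity(document, aliases).items():
        segment = " ".join(sentences)
        labels[entity] = polarity(segment) if is_subjective(segment) else "objective"
    return labels
```

For a document such as "Acme posted a record profit. Borealis fell after reporting losses. Cobalt opens a new office." with one alias per company, each entity is labelled from its own sentence only, so a single document can carry different polarities for different entities, which is the motivation the abstract gives for moving away from a global polarity.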
