Global ETD Search

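Um estudo sobre a relevância dos padrões lexicais para a interpretação de textos por meio da extração de informação (A study on the relevance of lexical patterns for text interpretation by means of information extraction)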

Made available in DSpace on 2017-07-10T18:55:26Z (GMT). No. of bitstreams: 1
Lucielen Porfirio.pdf: 522478 bytes, checksum: 120e6f485faab939a4f8ab24bf1f53d1 (MD5)
Previous issue date: 2006-02-17

Text interpretation is an inherently complex process that depends not only on linguistic aspects but also on cognitive and extralinguistic ones. To interpret a text, a reader must first be able to decode its language and form mental representations of the message it conveys. To do so, the reader necessarily has to raise hypotheses, draw inferences, and activate prior knowledge, both linguistic and extralinguistic (world knowledge). In addition, the reader must locate the main ideas of the text, which are expressed in the lexical items and in the interactions among them. It is therefore reasonable to assume that identifying isolated terms in a text and analyzing their actual functions in it are both highly relevant to the work of interpretation.

Several methods have been used to work on text interpretation. Among the most common are answering questions (oral or written) about the content of the text and, more recently, information extraction (IE). IE is a method that consists, fundamentally, of identifying and extracting relevant linguistic aspects (lexical, syntactic, and conceptual-semantic) for purposes such as summarization, categorization, and text interpretation. By locating keywords and linguistic structures, the method aims not only to identify but also to extract the important information that, taken together, may allow a reader to understand the subject under discussion more easily.

Assuming that the interactions among lexical items are, if not the only, one of the most important elements in text interpretation, the goal of this research is to discuss how the reader could better exploit these relations as an aid to interpretation. For the analysis, three keywords were tracked in a research corpus in the domain of gastroenterology: intestine (intestino), cause (causa), and helicobacter pylori. Based on the lexical patterns of collocation, colligation, and semantic prosody, the occurrences of each word were analyzed to see how their linguistic relations reveal meanings and support the interpretation process. As a result, we observed that, even without access to the texts as a whole, the occurrences of these lexical patterns made it possible to extract information about the subject of the texts and about important aspects discussed in them, such as diseases and their causes, effects, and treatments.
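The keyword-tracking step mentioned in the abstract can be illustrated with a rough sketch. This is not the procedure used in the thesis, whose tools and corpus are not described in this record. It is a minimal collocate count under stated assumptions: the names `tokenize` and `collocates`, the window size, and the two-sentence sample are all hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens (letters only, accents kept)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def collocates(tokens, keyword, window=3):
    """Count the words found within `window` tokens on either side of `keyword`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != keyword:
            continue
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        counts.update(left + right)
    return counts


# Hypothetical sample sentences, NOT taken from the thesis corpus.
sample = (
    "Helicobacter pylori can cause gastritis. "
    "Infection is a common cause of ulcers in the intestine."
)
tokens = tokenize(sample)
cause_collocates = collocates(tokens, "cause", window=2)
print(cause_collocates.most_common(5))
```

Even in this toy sample, the neighbours of "cause" include disease names, which is the kind of information the abstract reports recovering from lexical patterns. The sketch handles single-word keywords only; a multi-word term such as helicobacter pylori would need phrase matching, and colligation or semantic prosody would need grammatical and evaluative annotation that is not shown here.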

text interpretation (interpretação de textos)

information extraction (extração de informação)

keyword (palavra-chave)

lexical patterns (padrões lexicais)

Identifier	oai:union.ndltd.org:IBICT/oai:tede.unioeste.br:tede/2324
Date	17 February 2006
Creators	Porfirio, Lucielen
Contributors	Bidarra, Jorge, Benites, Sonia Aparecida Lopes, Sella, Aparecida Feola
Publisher	Universidade Estadual do Oeste do Paraná, Programa de Pós-Graduação "Stricto Sensu" em Letras, UNIOESTE, BR, Linguagem e Sociedade
Source Sets	IBICT Brazilian ETDs
Language	Portuguese
Detected Language	English
Type	info:eu-repo/semantics/publishedVersion, info:eu-repo/semantics/masterThesis
Format	application/pdf
Source	reponame:Biblioteca Digital de Teses e Dissertações do UNIOESTE, instname:Universidade Estadual do Oeste do Paraná, instacron:UNIOESTE
Rights	info:eu-repo/semantics/openAccess
