Learning non-verbal relations under open information extraction paradigm

Made available in DSpace on 2015-04-14T14:50:19Z (GMT). No. of bitstreams: 1
466321.pdf: 1994049 bytes, checksum: fbbeef81814a876679c25f4e015925f5 (MD5)
Previous issue date: 2014-03-12

Open Information Extraction (Open IE) is a relation extraction paradigm in which the target relations are not specified in advance. It aims to overcome limitations of traditional information extraction methods, such as domain dependence and poor scalability. To extend Open IE to relations that are not expressed by verbs in English texts, we introduce CompIE, a component that learns relations expressed in noun compounds (NCs), such as (oil, extracted from, olive) from "olive oil", or in adjective-noun pairs (ANs), such as (moon, that is, gorgeous) from "gorgeous moon". CompIE takes a text file as input and outputs a set of triples describing binary relations. Its architecture comprises two main tasks: (1) NC and AN extraction and (2) NC and AN interpretation. The first task generates a list of NCs and ANs from the input corpus. The second task interprets those NCs and ANs and generates the triples that describe the relations extracted from the corpus. To study CompIE's feasibility, we perform a hypothesis-based evaluation and built a prototype to validate each hypothesis. The results show that our solution achieves 89% precision and demonstrate that CompIE meets its goal of extending the Open IE paradigm by extracting relations expressed within NCs and ANs.
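The two-task architecture described in the abstract (extraction of NCs and ANs, then interpretation into triples) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the input is assumed to be already POS-tagged with Penn Treebank style tags, the names `extract_pairs`, `interpret`, and `NC_RELATIONS` are hypothetical, and a hand-written lookup table stands in for the learned interpretation step.

```python
from typing import List, Tuple

Tagged = Tuple[str, str]        # (token, POS tag); Penn Treebank style tags assumed
Pair = Tuple[str, str, str]     # (kind, modifier, head), kind is "NC" or "AN"
Triple = Tuple[str, str, str]   # (argument 1, relation, argument 2)

# Hypothetical lookup table standing in for the learned interpretation of NCs.
NC_RELATIONS = {("olive", "oil"): "extracted from"}


def extract_pairs(tagged: List[Tagged]) -> List[Pair]:
    """Task 1 (sketch): collect adjacent noun-noun and adjective-noun pairs."""
    pairs = []
    for (w1, t1), (w2, t2) in zip(tagged, tagged[1:]):
        if t2.startswith("NN"):
            if t1.startswith("NN"):
                pairs.append(("NC", w1, w2))
            elif t1.startswith("JJ"):
                pairs.append(("AN", w1, w2))
    return pairs


def interpret(pairs: List[Pair]) -> List[Triple]:
    """Task 2 (sketch): turn each pair into a (head, relation, modifier) triple."""
    triples = []
    for kind, modifier, head in pairs:
        if kind == "AN":
            # Assumption: every AN is read as "head that is modifier".
            triples.append((head, "that is", modifier))
        elif (modifier, head) in NC_RELATIONS:
            triples.append((head, NC_RELATIONS[(modifier, head)], modifier))
    return triples


# Hand-tagged toy input; a real pipeline would run a POS tagger on the text file.
tagged = [("the", "DT"), ("gorgeous", "JJ"), ("moon", "NN"),
          ("and", "CC"), ("olive", "NN"), ("oil", "NN")]
triples = interpret(extract_pairs(tagged))
print(triples)
```

On this toy input the sketch yields the two example triples quoted in the abstract. NCs missing from the lookup table are skipped, which is exactly the gap a learned interpreter is meant to fill.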

Identifier: oai:union.ndltd.org:IBICT/oai:tede2.pucrs.br:tede/5275
Date: 12 March 2014
Creators: Xavier, Clarissa Castellã
Contributors: Lima, Vera Lúcia Strube de
Publisher: Pontifícia Universidade Católica do Rio Grande do Sul, Programa de Pós-Graduação em Ciência da Computação, PUCRS, BR, Faculdade de Informática
Source Sets: IBICT Brazilian ETDs
Language: English
Detected Language: English
Type: info:eu-repo/semantics/publishedVersion, info:eu-repo/semantics/doctoralThesis
Format: application/pdf
Source: reponame:Biblioteca Digital de Teses e Dissertações da PUC_RS, instname:Pontifícia Universidade Católica do Rio Grande do Sul, instacron:PUC_RS
Rights: info:eu-repo/semantics/openAccess
Relation: 1974996533081274470, 500, 600, 1946639708616176246
