Global ETD Search

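Automatic Generation of Single-Document Abstractive Summaries Using Semantic and Discourse Analysis (Generación automática de resúmenes abstractivos mono documento utilizando análisis semántico y del discurso)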

The web is a vast source of data and information on security, health,
education, and other topics of great use to people. Producing a synthesis or
summary of one or many documents, however, is costly work, and doing it
manually may be impossible given the sheer volume of data. Summary generation
is a challenging task because it requires analyzing and understanding
unstructured, context-dependent natural-language text, and the result must
convey a synthesis of events or knowledge in a simple form that reads
naturally to any reader. There are various approaches to summarization,
categorized as extractive or abstractive. In the extractive technique,
summaries are built by selecting salient sentences from the source text.
Abstractive summaries are created by regenerating the content extracted from
the source text: phrases are reformulated through term fusion, compression,
or suppression. This yields paraphrased sentences, or even sentences that did
not appear in the original text. This type of summary is more likely to reach
the coherence and fluency of one written by a human. The present work
implements a method that integrates syntactic, semantic (AMR annotator), and
discourse (RST) information into a conceptual graph. The graph is then
summarized using a new measure of concept similarity over WordNet. To find
the most relevant concepts we use PageRank, taking into account the discourse
information provided by applying O'Donnell's method. With the most important
concepts and the semantic role information obtained from PropBank, a natural
language generation method was implemented with the SimpleNLG tool.
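
The concept-ranking step can be illustrated with a minimal sketch of weighted PageRank over a small concept graph. This is not the thesis implementation: the toy graph, its edge weights, the damping factor, and the function name `pagerank` are all assumptions made for illustration. In the thesis the weights would presumably come from the concept similarity measure and the discourse information, which the abstract does not specify.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Power-iteration PageRank over a dict {node: {neighbor: weight}}."""
    nodes = set(graph)
    for neighbors in graph.values():
        nodes.update(neighbors)
    nodes = sorted(nodes)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    # Total outgoing weight per node, used to normalize edge weights.
    out_weight = {node: sum(graph.get(node, {}).values()) for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing weight is spread uniformly.
        dangling = sum(rank[v] for v in nodes if out_weight[v] == 0)
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src, neighbors in graph.items():
            if out_weight[src] == 0:
                continue
            for dst, weight in neighbors.items():
                new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        rank = new_rank
    return rank


# Hypothetical concept graph; the weights are illustrative placeholders.
concept_graph = {
    "attack": {"city": 1.0, "rebel": 1.0},
    "rebel": {"attack": 1.0},
    "city": {"attack": 1.0},
    "weather": {"city": 0.5},
}

ranks = pagerank(concept_graph)
for concept in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{concept}: {ranks[concept]:.3f}")
```

In this toy graph the concept with the most incoming weight ranks highest, and a concept that nothing points to ranks lowest. That is the behavior a summarizer would rely on when choosing which concepts to verbalize.
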
This work presents the results of applying the method to the Document
Understanding Conference 2002 corpus, evaluated with the ROUGE metric, which
is widely used in automatic summarization. Our method reaches an F1 score of
24% on ROUGE-1 for the single-document abstractive summary generation task.
This shows that these techniques are workable and even beneficial; the work
also recommends configurations and useful tools for this task. / Thesis
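
For readers unfamiliar with the metric, ROUGE-1 F1 is the harmonic mean of unigram precision and recall between a generated summary and a reference summary. The sketch below is a simplified version that assumes lowercased whitespace tokenization and a single reference. The official ROUGE toolkit offers further options (such as stemming and multiple references) that are not reproduced here, and the function name `rouge1_f1` and the example sentences are hypothetical.

```python
from collections import Counter


def rouge1_f1(candidate, reference):
    """Simplified ROUGE-1 F1: clipped unigram overlap, whitespace tokens."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical candidate and reference summaries.
score = rouge1_f1(
    "the rebels attacked the city",
    "rebels attacked the capital city at dawn",
)
print(f"ROUGE-1 F1: {score:.3f}")
```

Here four unigrams overlap, so precision is 4/5, recall is 4/7, and the F1 score is 2/3.
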

http://hdl.handle.net/20.500.12404/9361

Semantic computing

Summaries

Semantics

Identifier	oai:union.ndltd.org:PUCP/oai:tesis.pucp.edu.pe:20.500.12404/9361
Date	20 September 2017
Creators	Valderrama Vilca, Gregory Cesar
Contributors	Sobrevilla Cabezudo, Marco Antonio
Publisher	Pontificia Universidad Católica del Perú, PE
Source Sets	Pontificia Universidad Católica del Perú
Language	English
Detected Language	English
Type	info:eu-repo/semantics/masterThesis
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess, http://creativecommons.org/licenses/by-nc-nd/2.5/pe/
