Proposition-based summarization with a coherence-driven incremental model

Summarization models which operate on meaning representations of documents have been neglected in the past, although they are a very promising and interesting class of methods for summarization and text understanding. In this thesis, I present one such summarizer, which uses the proposition as its meaning representation. My summarizer is an implementation of Kintsch and van Dijk's model of comprehension, which uses a tree of propositions to represent the working memory. The input document is processed incrementally in iterations. In each iteration, new propositions are connected to the tree under the principle of local coherence, and then a forgetting mechanism is applied so that only a few important propositions are retained in the tree for the next iteration. A summary can be generated using the propositions which are frequently retained. Originally, this model was only played through by hand by its inventors using human-created propositions. In this work, I turned it into a fully automatic model using current NLP technologies. First, I create propositions by obtaining and then transforming a syntactic parse. Second, I have devised algorithms to numerically evaluate alternative ways of adding a new proposition, as well as to predict necessary changes in the tree. Third, I compared different methods of modelling local coherence, including coreference resolution, distributional similarity, and lexical chains. In the first group of experiments, my summarizer realizes summary propositions by sentence extraction. These experiments show that my summarizer outperforms several state-of-the-art summarizers. The second group of experiments concerns abstractive generation from propositions, which is a collaborative project. I have investigated the option of compressing extracted sentences, but generation from propositions has been shown to provide better information packaging.
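The incremental cycle described in the abstract (attach new propositions, forget all but a few, count what is retained) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not the thesis's algorithm: propositions are hand-written `(predicate, arguments)` tuples, local coherence is approximated by shared arguments, the forgetting rule keeps the shallowest and then most recent propositions, and the names `shares_argument`, `run_cycles`, `summarize`, and `capacity` are hypothetical.

```python
from collections import Counter


def shares_argument(p, q):
    """Crude stand-in for local coherence: the two propositions share an argument."""
    return bool(set(p[1]) & set(q[1]))


def run_cycles(cycles, capacity=2):
    """Process propositions cycle by cycle and count how often each is retained.

    cycles: list of lists of (predicate, args_tuple), one inner list per iteration.
    capacity: assumed number of propositions kept in working memory between cycles.
    """
    memory = {}  # proposition -> (depth in the tree, arrival index)
    retained = Counter()
    arrival = 0
    for new_props in cycles:
        for prop in new_props:
            if prop in memory:
                continue
            # Attach below the shallowest proposition in memory that shares an argument.
            linked = [memory[q][0] for q in memory if shares_argument(prop, q)]
            if linked:
                depth = min(linked) + 1
            elif memory:
                # No link found: place it below everything else (a simplification).
                depth = max(d for d, _ in memory.values()) + 1
            else:
                depth = 0  # the first proposition becomes the root
            memory[prop] = (depth, arrival)
            arrival += 1
        # Forgetting: keep the shallowest propositions, preferring recent ones on ties.
        keep = sorted(memory, key=lambda p: (memory[p][0], -memory[p][1]))[:capacity]
        memory = {p: memory[p] for p in keep}
        retained.update(keep)
    return retained


def summarize(cycles, capacity=2, size=1):
    """Return the `size` most frequently retained propositions."""
    return [p for p, _ in run_cycles(cycles, capacity).most_common(size)]


example = [
    [("see", ("john", "dog")), ("brown", ("dog",))],
    [("bark", ("dog",)), ("loud", ("bark_event",))],
    [("run", ("john",))],
]
print(summarize(example))
```

In this sketch a proposition that stays near the root survives every cycle and so dominates the summary, which mirrors the abstract's idea that frequently retained propositions are the important ones. The thesis itself evaluates alternative attachments numerically and compares several coherence models, none of which is reproduced here.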

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:763848
Date: January 2019
Creators: Fang, Yimai
Contributors: Teufel, Simone
Publisher: University of Cambridge
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: https://www.repository.cam.ac.uk/handle/1810/287468
