This work is an inquiry into automatic summarization of short fiction. In this dissertation, I present a system that composes summaries of literary short stories employing two types of information: information about entities central to a story and information about the grammatical aspect of clauses. The summaries are tailored to a specific purpose: helping a reader decide whether she would be interested in reading a particular story. They contain just enough information to enable a reader to form adequate expectations about the story, but they do not reveal the plot. According to these criteria, a target summary provides a reader with an idea of whom the story is about, where and when it happens (in a way that goes beyond simply listing names and places) but does not re-tell the events of the story.
In order to build such summaries, the system attempts to identify sentences that meet two criteria: they focus on main entities in the story and they relate the background of the story rather than events. Discussing the criteria for the sentence selection process comprises a large part of this dissertation. These criteria can be roughly divided into two categories: (1) information about main entities (e.g., main characters and locations) and (2) information related to the grammatical aspect of clauses. By relying on this information the system selects sentences that contain important information pertinent to the setting of the story.
Six human judges evaluated the produced summaries in two different ways. Initially, the machine-made summaries were compared against man-made ones. On this account, the summaries rated better than those produced using two naive lead-based baselines. Subsequently, the judges answered a number of questions using the summaries as the only source of information. These answers were compared with the answers made using the complete stories. The summaries appeared to be useful for helping the judges decide whether they would like to read the stories. The judges could also answer simple questions about the setting of the story using the summaries only. The results suggest that aspectual information and information about important entities can be effectively used to build summaries of literary short fiction, even though this information alone is not sufficient for producing high-quality indicative summaries.
Identifier | oai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/27861 |
Date | January 2007 |
Creators | Kazantseva, Anna |
Publisher | University of Ottawa (Canada) |
Source Sets | Université d’Ottawa |
Language | English |
Detected Language | English |
Type | Thesis |
Format | 131 p. |