In this thesis we use a knowledge-based approach to disambiguating prepositional phrase attachments in English sentences. This method was first introduced by S. M. Harabagiu. The Penn Treebank corpus is used as the training text. We extract 4-tuples of the form <em>VP</em>, <em>NP</em><sub>1</sub>, Prep, <em>NP</em><sub>2</sub> and sort them into classes according to the semantic relationships between the parts of each tuple. These relationships are extracted from WordNet. The classes are then grouped into tiers based on the strictness of their semantic relationship. Disambiguation of prepositional phrase attachments can be cast as a constraint satisfaction problem in which the tiers of extracted classes act as the constraints. Satisfaction is achieved when the strictest possible tier unanimously indicates one kind of attachment. The most challenging cases for disambiguation are those in which the prepositional phrase may attach to either the closest verb or the closest noun. <br /><br /> We first demonstrate that the best approach to extracting tuples from parsed texts is a top-down postorder traversal algorithm. We then describe the challenges in forming the prepositional classes from WordNet semantic relations, and discuss the steps needed to apply the prepositional classes to the disambiguation task. We also discuss a novel application of this method in which the tuples to be disambiguated are themselves expanded via WordNet, thus introducing a client-side application of the algorithms used to build the prepositional classes. Finally, we present results for different variants of our disambiguation algorithm, contrasting the precision and recall of various combinations of constraints, and comparing our algorithm to a baseline method that falls back to attaching a prepositional phrase to the closest left phrase. 
We conclude that our algorithm outperforms the baseline and is therefore a useful new method for knowledge-based disambiguation of prepositional phrase attachments.
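The tiered decision rule described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `disambiguate`, the labels `"V"` (verb attachment) and `"N"` (noun attachment), and the representation of each tier as a list of votes are all assumptions made for illustration. The fallback to `"N"` assumes that, in a VP, NP1, Prep, NP2 tuple, the closest left phrase is NP1.

```python
from typing import Sequence


def disambiguate(tier_votes: Sequence[Sequence[str]], fallback: str = "N") -> str:
    """Pick an attachment from tiers ordered strictest first.

    Each tier holds the attachment ("V" or "N") indicated by every class
    in that tier that matched the tuple. These names and this data layout
    are hypothetical, chosen only to illustrate the idea.
    """
    for votes in tier_votes:
        distinct = set(votes)
        # A tier satisfies the constraint only if it has at least one vote
        # and all of its votes agree on a single kind of attachment.
        if len(distinct) == 1:
            return distinct.pop()
    # No tier was unanimous: fall back to the baseline attachment.
    return fallback
```

For example, `disambiguate([["V", "N"], ["V", "V"]])` skips the conflicting strict tier and takes the answer from the looser, unanimous one, while an input with no unanimous tier returns the baseline choice.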
Identifier | oai:union.ndltd.org:WATERLOO/oai:uwspace.uwaterloo.ca:10012/1051 |
Date | January 2006 |
Creators | Spitzer, Claus |
Publisher | University of Waterloo |
Source Sets | University of Waterloo Electronic Theses Repository |
Language | English |
Detected Language | English |
Type | Thesis or Dissertation |
Format | application/pdf, 363875 bytes, application/pdf |
Rights | Copyright: 2006, Spitzer, Claus. All rights reserved. |