Semantic Distance in WordNet: A Simplified and Improved Measure of Semantic Relatedness

Measures of semantic distance have received a great deal of attention recently in the field of computational lexical semantics. Although techniques for approximating the semantic distance of two concepts have existed for several decades, the introduction of the WordNet lexical database and improvements in corpus analysis have enabled significant improvements in semantic distance measures.

In this study we investigate a special kind of semantic distance, called semantic relatedness. Lexical semantic relatedness measures have proved to be useful for a number of applications, such as word sense disambiguation and real-word spelling error correction. Most relatedness measures rely on the observation that the shortest path between nodes in a semantic network provides a representation of the relationship between two concepts. The strength of relatedness is computed in terms of this path.

This dissertation makes several significant contributions to the study of semantic relatedness. We describe a new measure that calculates semantic relatedness as a function of the shortest path in a semantic network. The proposed measure achieves better results than other standard measures and yet is much simpler than previous models. The proposed measure is shown to achieve a correlation of r = 0.897 with the judgments of human test subjects using a standard benchmark data set, representing the best performance reported in the literature. We also provide a general formal description for a class of semantic distance measures, namely those measures that compute semantic distance from the shortest path in a semantic network. Lastly, we suggest a new methodology for developing path-based semantic distance measures that would limit the possibility of unnecessary complexity in future measures.

Identifier: oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:OWTU.10012/1016
Date: January 2006
Creators: Scriver, Aaron
Publisher: University of Waterloo
Source Sets: Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada
Language: English
Detected Language: English
Type: Thesis or Dissertation
Format: application/pdf, 546817 bytes, application/pdf
Rights: Copyright: 2006, Scriver, Aaron. All rights reserved.
