Topic Detection and Tracking (TDT) research has produced some successful statistical tracking systems. While lexical chaining, a non-statistical approach, has also been applied to the task of tracking by Carthy and Stokes for the 2001 TDT evaluation, an efficient tracking system based on this technology has yet to be developed. In thesis we investigate two new techniques which can improve Carthy's original design. First, at the core of our system is a semantic domain chainer. This chainer relies not only on the WordNet database for semantic relationships but also on Magnini's semantic domain database, which is an extension of WordNet. The domain-chaining algorithm is a linear algorithm. Second, to handle proper nouns, we gather all of the ones that occur in a news story together in a chain reserved for proper nouns. In this thesis we also discuss the linguistic limitations of lexical chainers to represent textual meaning.
Identifer | oai:union.ndltd.org:unt.edu/info:ark/67531/metadc4274 |
Date | 08 1900 |
Creators | Yang, Li |
Contributors | Montler, Timothy, Ross, John Robert, 1938-, Mihalcea, Rada, 1974- |
Publisher | University of North Texas |
Source Sets | University of North Texas |
Language | English |
Detected Language | English |
Type | Thesis or Dissertation |
Format | Text |
Rights | Public, Copyright, Yang, Li, Copyright is held by the author, unless otherwise noted. All rights reserved. |
Page generated in 0.002 seconds