
WebDoc: An Automated Web Document Indexing System

Tang, Bo 13 December 2002
This thesis describes WebDoc, an automated system that classifies Web documents according to the Library of Congress classification system. This work is an extension of an early version of the system that successfully generated indexes for journal articles. The unique features of Web documents, as well as how they will affect the design of a classification system, are discussed. We argue that full-text analysis of Web documents is inevitable, and contextual information must be used to assist the classification. The architecture of the WebDoc system is presented. We performed experiments on it with and without the assistance of contextual information. The results show that contextual information improved the system's performance significantly.
