
An Ontology-Based Personalized Document Clustering Approach

Huang, Tse-hsiu 05 August 2004
With the proliferation of electronic commerce and knowledge economy environments, both persons and organizations increasingly have generated and consumed large amounts of online information, typically available as textual documents. To manage this rapid growth of the number of textual documents, people often use categories or folders to organize their documents. These document grouping behaviors are intentional acts that reflect the persons' (or organizations') preferences with regard to semantic coherency, or relevant groupings between subjects. For this thesis, we design and implement an ontology-based personalized document clustering (OnPEC) technique by incorporating both an individual user's partial clustering and an ontology into the document clustering process. Our use of a target user's partial clustering supports the personalization of document categorization, whereas our use of the ontology turns document clustering from a feature-based to a concept-based approach. In addition, we combine two hierarchical agglomerative clustering (HAC) approaches (i.e., pre-cluster-based and atomic-based) in our proposed OnPEC technique. Using the clustering effectiveness achieved by a traditional content-based document clustering technique and previously proposed feature-based document clustering (PEC) techniques as performance benchmarks, we find that use of partial clusters improves document clustering effectiveness, as measured by cluster precision and cluster recall. Moreover, for both OnPEC and PEC techniques, the clustering effectiveness of pre-cluster-based HAC methods greatly outperforms that of atomic-based HAC methods.
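The abstract does not spell out how the two HAC variants differ, so the following is only an illustrative sketch under one plausible reading: in the pre-cluster-based variant the user's partial clusters are fixed as starting clusters, while in the atomic-based variant every document starts as a singleton. The function names, the Jaccard similarity, the average-link criterion, the stop-at-k rule, and the toy documents are all assumptions made for this example and are not taken from the thesis. Documents are represented as sets of terms here; a concept-based version would presumably use ontology concepts in their place.

```python
from itertools import combinations

# Hypothetical toy data: document id -> set of terms (or ontology concepts).
docs = {
    "d1": {"price", "market"},
    "d2": {"market", "trade"},
    "d3": {"gene", "protein"},
    "d4": {"protein", "cell"},
}

def jaccard(a, b):
    """Jaccard similarity between two sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def average_link(c1, c2, docs):
    """Mean pairwise similarity between the documents of two clusters."""
    total = sum(jaccard(docs[i], docs[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))

def hac(docs, k, seed_clusters=()):
    """Merge the most similar pair of clusters until k clusters remain.

    seed_clusters: groups of document ids fixed before merging starts
    (the assumed pre-cluster-based reading). Leave it empty for the
    assumed atomic-based reading, where every document is a singleton.
    """
    seeded = {d for c in seed_clusters for d in c}
    clusters = [frozenset(c) for c in seed_clusters]
    clusters += [frozenset([d]) for d in docs if d not in seeded]
    while len(clusters) > k:
        # Pick the pair with the highest average-link similarity.
        a, b = max(combinations(clusters, 2),
                   key=lambda pair: average_link(pair[0], pair[1], docs))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters

atomic = hac(docs, k=2)
personalized = hac(docs, k=2, seed_clusters=[["d1", "d3"]])
```

In this sketch, seeded documents can never be separated again, which is one simple way a user's partial clustering could steer the final grouping; the thesis may well handle this differently.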
