• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 1
  • Tagged with
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Personalized Document Clustering: Technique Development and Empirical Evaluation

Wu, Chia-Chen 14 August 2003 (has links)
With the proliferation of an electronic commerce and knowledge economy environment, both organizations and individuals generate and consume a large amount of online information, typically available as textual documents. To manage the ever-increasing volume of documents, organizations and individuals typically organize their documents into categories to facilitate document management and subsequent information access and browsing. However, document grouping behaviors are intentional acts, reflecting individuals¡¦ (or organizations¡¦) preferential perspective on semantic coherency or relevant groupings between subjects. Thus, an effective document clustering needs to address the described preferential perspective on document grouping and support personalized document clustering. In this thesis, we designed and implemented a personalized document clustering approach by incorporating individual¡¦s partial clustering into the document clustering process. Combining two document representation methods (i.e., feature refinement and feature weighting) with two clustering processes (i.e., pre-cluster-based and atomic-based), four personalized document clustering techniques are proposed. Using the clustering effectiveness achieved by a traditional content-based document clustering technique as performance benchmarks, our evaluation results suggest that use of partial clusters would improve the document clustering effectiveness. Moreover, the pre-cluster-based technique outperforms the atomic-based one, and the feature weighting method for document representation achieves a higher clustering effectiveness than the feature refinement method does.

Page generated in 0.1496 seconds