
Personalized and Context-aware Document Clustering

Yang, Chin-Sheng 15 July 2007
To manage the ever-increasing volume of documents, organizations and individuals typically organize documents into categories (or category hierarchies) to facilitate document management and support subsequent document retrieval and access. Document clustering is an intentional act that should reflect individuals' preferences with regard to the semantic coherency or relevant categorization of documents and should conform to the context of a target task under investigation. Thus, effective document clustering techniques need to take into account a user's categorization context defined by or relevant to the target task under consideration. However, existing document clustering techniques generally rely on purely content-based analysis and therefore cannot facilitate personalized or context-aware document clustering. In response, we design, implement, and empirically evaluate three document clustering techniques capable of facilitating personalized or contextual document clustering. First, we extend an existing document clustering technique (specifically, the partial-clustering-based personalized document-clustering (PEC) approach) and propose the Collaborative Filtering-based personalized document-Clustering (CFC) technique to overcome the problem of small-sized partial clustering encountered by the PEC technique. In particular, the CFC technique expands the size of a user's partial clustering based on the partial clusterings of other users with similar categorization preferences. Second, to support contextual document clustering, we design and implement a Context-Aware document-Clustering (CAC) technique that takes into consideration a user's categorization preference (i.e., a set of anchoring terms) relevant to the context of a target task and a statistics-based thesaurus constructed from the World Wide Web (WWW) via a search engine.
Third, because a small set of anchoring terms can greatly degrade the effectiveness of the CAC technique, we extend CAC and propose a Collaborative Filtering-based Context-Aware document Clustering (CF-CAC) technique. Our empirical evaluation results suggest that our proposed CFC, CAC, and CF-CAC techniques better support the need for personalized and contextual document clustering than do their benchmark techniques.
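
The collaborative-filtering idea behind CFC (expanding one user's partial clustering with the partial clusterings of like-minded users) can be illustrated with a minimal sketch. This is not the thesis's algorithm: the abstract does not specify the similarity measure or the expansion rule, so the pair-agreement similarity, the `min_sim` threshold, the majority-vote label mapping, and the function names `pair_agreement` and `expand_partial_clustering` are all assumptions made for illustration. A partial clustering is represented here as a dictionary from document id to category label.

```python
from collections import Counter
from itertools import combinations


def pair_agreement(a, b):
    """Assumed similarity: the fraction of shared document pairs that both
    users treat the same way (both together, or both apart).
    Returns 0.0 when the users share fewer than two documents."""
    shared = sorted(set(a) & set(b))
    pairs = list(combinations(shared, 2))
    if not pairs:
        return 0.0
    agree = sum((a[x] == a[y]) == (b[x] == b[y]) for x, y in pairs)
    return agree / len(pairs)


def expand_partial_clustering(target, others, min_sim=0.5):
    """Return a copy of `target` extended with documents borrowed from
    users whose similarity to the target is at least `min_sim`
    (a hypothetical threshold, not taken from the source)."""
    expanded = dict(target)
    # Visit the most similar users first.
    ranked = sorted(others, key=lambda o: pair_agreement(target, o), reverse=True)
    for other in ranked:
        if pair_agreement(target, other) < min_sim:
            break  # remaining users are even less similar
        for doc, other_label in other.items():
            if doc in expanded:
                continue
            # Map the neighbour's category onto the target's own labels:
            # vote with the target's originally clustered documents that
            # the neighbour placed in the same category as `doc`.
            votes = Counter(
                target[d]
                for d, label in other.items()
                if label == other_label and d in target
            )
            if votes:
                expanded[doc] = votes.most_common(1)[0][0]
    return expanded


# Usage with made-up data: `similar` groups d1..d3 exactly as the target does.
target = {"d1": "A", "d2": "A", "d3": "B"}
similar = {"d1": "x", "d2": "x", "d3": "y", "d4": "x", "d5": "y"}
dissimilar = {"d1": "p", "d2": "q", "d3": "p", "d6": "p"}
print(expand_partial_clustering(target, [dissimilar, similar]))
```

In this toy run the user who agrees with the target on the shared documents contributes d4 and d5, while the dissimilar user falls below the threshold and contributes nothing. The expanded partial clustering would then be passed to whatever personalized clustering step follows, which this sketch does not model.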
