Global ETD Search


Clustering Articles in a Literature Digital Library Based on Content and Usage

A literature digital library is one of the most important resources for preserving the assets of civilization. To make information search more effective and efficient, many systems are equipped with a browsing interface that aims to ease the article-searching task. A browsing interface is associated with a subject directory, which guides users to the articles that meet their information needs. A subject directory contains a set (or a hierarchy) of subject categories, each containing a number of similar articles. How to group articles in a literature digital library is the theme of this thesis.
Previous work used either document classification or document clustering approaches to assign articles to a set of article clusters based on their content. We observed that the articles that meet a single user's information need do not necessarily fall in a single cluster. In this thesis, we propose to make use of both the Web log and article content in clustering articles. We propose two hybrid approaches, namely a document categorization based method and a document clustering based method. These alternatives were compared with other content-based methods. The document categorization based method was found to effectively reduce the number of required click-throughs at the expense of a slight increase in entropy, which measures the content heterogeneity of each generated cluster.
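The abstract does not give the exact entropy formula used in the thesis. As a minimal sketch, assuming each article carries a single content category label and that cluster heterogeneity is measured with Shannon entropy over those labels (weighted by cluster size for an overall score), the measure could look like the following. The function names and the sample labels are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def cluster_entropy(labels):
    """Shannon entropy (in bits) of the category labels inside one cluster.

    Assumption: each article has exactly one category label.
    0.0 means the cluster is homogeneous; higher values mean more mixed content.
    """
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    # sum of p * log2(1/p) over the categories present in the cluster
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def weighted_entropy(clusters):
    """Size-weighted average entropy over all clusters (an assumed aggregate)."""
    total = sum(len(cluster) for cluster in clusters)
    if total == 0:
        return 0.0
    return sum(len(cluster) / total * cluster_entropy(cluster) for cluster in clusters)


if __name__ == "__main__":
    # Hypothetical clusters, each listed by the category label of its articles.
    clusters = [["db", "db"], ["db", "ir"]]
    print(cluster_entropy(clusters[0]))  # homogeneous cluster
    print(cluster_entropy(clusters[1]))  # evenly mixed cluster
    print(weighted_entropy(clusters))
```

Under this reading, a usage-driven clustering that groups articles from different categories into one cluster would raise the entropy, which is the trade-off the abstract describes against fewer click-throughs.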

http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0810104-153712

Digital library

Document categorization

Usage clustering

Document clustering

Content-based clustering

Identifier	oai:union.ndltd.org:NSYSU/oai:NSYSU:etd-0810104-153712
Date	10 August 2004
Creators	Ting, Kang-Di
Contributors	Fu-Ren Lin, San-Yih Hwang, Chih-Ping Wei
Publisher	NSYSU
Source Sets	NSYSU Electronic Thesis and Dissertation Archive
Language	English
Detected Language	English
Type	text
Format	application/pdf
Source	http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0810104-153712
Rights	off_campus_withheld, Copyright information available at source archive
