An exploratory study of human clustering of Web pages

This exploratory study examines how people naturally cluster Web pages. For each of 10 queries, 20 Web pages retrieved by the Northern Light search engine were sorted by 3 subjects into categories that were natural or meaningful to them. Different subjects clustered the same set of Web pages quite differently and created different categories: the average inter-subject similarity of the clusters was low, at 0.27. Subjects created an average of 5.4 clusters per sorting. The categories constructed fall into 10 types. About one third of the categories were topical, and another 20% related to the degree of relevance or usefulness. The rest were subject-independent categories such as format, purpose, authoritativeness and direction to other sources. The authors plan to develop automatic methods for categorizing Web pages using the common categories created by the subjects, in the hope that Web search engines can use these techniques to organize retrieved Web pages automatically into categories that are natural to users.
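The abstract reports an average inter-subject similarity but does not say how it was measured. As a minimal sketch of one plausible way to compare two subjects' sortings of the same pages, the Python below computes a Jaccard coefficient over the page pairs that each subject placed in the same cluster. The measure, the function names and the sample data are all assumptions made for illustration, not the authors' method or data.

```python
from itertools import combinations


def co_clustered_pairs(sorting):
    """Return the set of unordered page pairs that share a cluster.

    `sorting` maps a cluster label to the list of page ids in that cluster.
    """
    pairs = set()
    for pages in sorting.values():
        # Sort so that each unordered pair has a single canonical form.
        for a, b in combinations(sorted(pages), 2):
            pairs.add((a, b))
    return pairs


def pairwise_jaccard(sorting_a, sorting_b):
    """Jaccard similarity of two sortings, based on co-clustered page pairs.

    Assumed measure: the abstract does not name the one the study used.
    """
    pairs_a = co_clustered_pairs(sorting_a)
    pairs_b = co_clustered_pairs(sorting_b)
    union = pairs_a | pairs_b
    if not union:
        # Neither subject grouped any two pages together.
        return 1.0
    return len(pairs_a & pairs_b) / len(union)


# Hypothetical sortings of five pages by two subjects (not data from the study).
subject_1 = {"topical": ["p1", "p2", "p3"], "not relevant": ["p4", "p5"]}
subject_2 = {"tutorials": ["p1", "p2"], "vendor pages": ["p3", "p4"], "other": ["p5"]}

print(pairwise_jaccard(subject_1, subject_2))  # 0.2
```

Because the comparison uses only which pages were grouped together, it ignores the labels subjects gave their categories, so two subjects who form the same groups under different names would still score 1.0.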

Identifier: oai:union.ndltd.org:arizona.edu/oai:arizona.openrepository.com:10150/106057
Date: January 2002
Creators: Khoo, Christopher S.G.; Ng, Karen; Ou, Shiyan
Contributors: López-Huertas, María J.
Publisher: Ergon-Verlag
Source Sets: University of Arizona
Language: English
Detected Language: English
Type: Conference Paper
