Folksonomies vs. Bag-of-Words: The Evaluation & Comparison of Different Types of Document Representations

This poster (2-page summary) was presented at the 17th Annual SIG/CR Classification Research Workshop, part of the 2006 Annual Meeting of the American Society for Information Science and Technology (ASIST), November 4, 2006, Austin, Texas. Among the factors that influence the effectiveness of retrieval systems, the most influential is the quality of document representation (docrep) (Lancaster, 1998). Most Internet search engines rely on docreps automatically extracted from web pages (commonly called Bag-of-Words). Unfortunately, this automatic approach often introduces noise (items unrelated to the page's core topic) into docreps. One way to reduce noise is to use user-created docreps, which are less susceptible to it. Until recently, it was impractical to rely on user-created docreps for Internet-size collections. This changed when online bookmarking web services such as citeulike.org and del.icio.us started to appear. These services made it easier for large Internet communities to collaborate and produce community-generated descriptors (known as folksonomies). Because they combine representations from many community members, folksonomies provide retrieval systems with docreps that tend to be more user-oriented. With this observation in mind, I am investigating whether folksonomy-based retrieval systems would yield more relevant results than conventional systems.

Identifier oai:union.ndltd.org:arizona.edu/oai:arizona.openrepository.com:10150/106052
Date January 2006
Creators Gruzd, Anatoliy A
Source Sets University of Arizona
Language English
Detected Language English
Type Conference Poster
