
CLUSTER-BASED TERM WEIGHTING AND DOCUMENT RANKING MODELS

A term weighting scheme measures the importance of a term in a collection, and a document ranking model uses these term weights to score or rank the documents in that collection. We present a series of cluster-based term weighting and document ranking models built on the TF-IDF and Okapi BM25 models. The proposed models update inter-cluster and intra-cluster frequency components based on the generated clusters, and use these components alongside the usual term frequency and document frequency components to weight the importance of a term. In this thesis, we show that these models outperform the TF-IDF and Okapi BM25 models in document clustering and ranking.
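For orientation, the two baseline models named in the abstract can be sketched as follows. This is a minimal illustration of textbook TF-IDF and one common Okapi BM25 variant, not the thesis's cluster-based models, whose formulas the abstract does not give. The function names, the parameter defaults `k1=1.2` and `b=0.75`, the smoothed IDF form, and the `cluster_counts` helper are all assumptions made for illustration; in particular, `cluster_counts` only tallies raw counts that a cluster-aware weight might plausibly draw on and should not be read as the author's definition of the inter-cluster or intra-cluster components.

```python
import math

def tf_idf(term, doc, docs):
    """Textbook TF-IDF: raw term frequency times log(N / document frequency)."""
    tf = doc.count(term)
    df = sum(1 for d in docs if term in d)
    if tf == 0 or df == 0:
        return 0.0
    return tf * math.log(len(docs) / df)

def bm25(term, doc, docs, k1=1.2, b=0.75):
    """One common Okapi BM25 term weight (k1, b defaults are assumed values)."""
    tf = doc.count(term)
    if tf == 0:
        return 0.0
    n = len(docs)
    df = sum(1 for d in docs if term in d)
    avgdl = sum(len(d) for d in docs) / n
    # Smoothed IDF variant with +1 inside the log, which keeps the weight non-negative.
    idf = math.log((n - df + 0.5) / (df + 0.5) + 1.0)
    length_norm = 1 - b + b * len(doc) / avgdl
    return idf * tf * (k1 + 1) / (tf + k1 * length_norm)

def cluster_counts(term, clusters):
    """Hypothetical helper: raw counts only, not the thesis's weighting formulas.

    Returns (number of clusters containing the term,
             list of per-cluster counts of documents containing the term).
    """
    per_cluster = [sum(1 for d in cluster if term in d) for cluster in clusters]
    clusters_with_term = sum(1 for c in per_cluster if c > 0)
    return clusters_with_term, per_cluster

# Toy collection: each document is a list of tokens.
docs = [
    ["apple", "banana", "apple"],
    ["banana", "cherry"],
    ["cherry", "date", "date", "date"],
]
clusters = [[docs[0], docs[1]], [docs[2]]]  # an assumed clustering of the toy collection

print(tf_idf("apple", docs[0], docs))
print(bm25("apple", docs[0], docs))
print(cluster_counts("banana", clusters))
```

In this toy collection, "apple" occurs twice in the first document and nowhere else, so both baselines weight it above "banana", which is spread over two documents. A cluster-based model would additionally look at how a term is distributed within and across clusters.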

Identifier: oai:union.ndltd.org:uky.edu/oai:uknowledge.uky.edu:gradschool_theses-1164
Date: 01 January 2011
Creators: Murugesan, Keerthiram
Publisher: UKnowledge
Source Sets: University of Kentucky
Detected Language: English
Type: text
Format: application/pdf
Source: University of Kentucky Master's Theses
