On the effect of INQUERY term-weighting scheme on query-sensitive similarity measures

Cluster-based information retrieval systems often use a similarity measure to compute the
association among text documents. In this thesis, we focus on a class of similarity
measures named Query-Sensitive Similarity (QSS) measures. Recent studies have shown
QSS measures to positively influence the outcome of a clustering procedure. These
studies have used QSS measures in conjunction with the ltc term-weighting scheme.
Several term-weighting schemes have since superseded the ltc term-weighting scheme and
demonstrated better retrieval performance relative to it. We test whether
introducing one of these schemes, INQUERY, will offer any benefit over the ltc scheme
when used in the context of QSS measures. The testing procedure uses the Nearest
Neighbor (NN) test to quantify the clustering effectiveness of QSS measures and the
corresponding term-weighting scheme.
The NN tests are applied to standard test document collections, and the results are
tested for statistical significance. Comparing the NN test results with those
obtained for the ltc scheme, we find several instances where the INQUERY scheme
improves the clustering effectiveness of QSS measures. To apply the NN test,
we designed a software test framework, Ferret, by complementing the features provided
by dtSearch, a search engine. The test framework automates the generation of NN
coefficients by processing standard test document collection data. We describe the
construction and operation of the Ferret test framework.
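
As a rough illustration of the NN test mentioned above, the sketch below assumes a common formulation: for each relevant document, count how many of its k nearest neighbors (under some similarity function) are also relevant. The abstract does not state the exact definition or the value of k used in the thesis, so the function name nn_relevant_counts, the parameter k, and the plain cosine similarity are all hypothetical stand-ins, not Ferret's or dtSearch's actual interface.

```python
import math
from typing import Callable, Dict, Hashable, Sequence, Set

Vector = Dict[str, float]  # term -> weight (any weighting scheme could fill this in)


def cosine(u: Vector, v: Vector) -> float:
    """Plain cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nn_relevant_counts(
    doc_ids: Sequence[Hashable],
    relevant: Set[Hashable],
    sim: Callable[[Hashable, Hashable], float],
    k: int = 5,  # assumed neighborhood size; not taken from the abstract
) -> Dict[Hashable, int]:
    """For each relevant document, count relevant documents among its k nearest neighbors."""
    counts: Dict[Hashable, int] = {}
    for d in doc_ids:
        if d not in relevant:
            continue
        # Rank every other document by decreasing similarity to d.
        others = sorted(
            (o for o in doc_ids if o != d),
            key=lambda o: sim(d, o),
            reverse=True,
        )
        counts[d] = sum(1 for o in others[:k] if o in relevant)
    return counts


# Toy collection (made-up weights, purely for illustration).
vectors: Dict[str, Vector] = {
    "a": {"x": 1.0},
    "b": {"x": 1.0, "y": 0.1},
    "c": {"y": 1.0},
    "d": {"y": 1.0, "x": 0.1},
}
toy_counts = nn_relevant_counts(
    list(vectors),
    relevant={"a", "b"},
    sim=lambda p, q: cosine(vectors[p], vectors[q]),
    k=1,
)
```

Higher counts suggest that relevant documents tend to sit close to one another under the chosen similarity. Comparing two term-weighting schemes would then amount to filling the vectors with different weights, or swapping the sim argument for a query-sensitive measure, and comparing the resulting counts.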

Identifier: oai:union.ndltd.org:tamu.edu/oai:repository.tamu.edu:1969.1/3116
Date: 12 April 2006
Creators: Kini, Ananth Ullal
Contributors: Nelson, Paul
Publisher: Texas A&M University
Source Sets: Texas A and M University
Language: en_US
Detected Language: English
Type: Book, Thesis, Electronic Thesis, text
Format: 286564 bytes, electronic, application/pdf, born digital