On the effect of INQUERY term-weighting scheme on query-sensitive similarity measures

Cluster-based information retrieval systems often use a similarity measure to compute the
association among text documents. In this thesis, we focus on a class of similarity
measures named Query-Sensitive Similarity (QSS) measures. Recent studies have shown
QSS measures to positively influence the outcome of a clustering procedure. These
studies have used QSS measures in conjunction with the ltc term-weighting scheme.
Several term-weighting schemes have since superseded the ltc term-weighting scheme and
demonstrated better retrieval performance than it. We test whether
introducing one of these schemes, INQUERY, will offer any benefit over the ltc scheme
when used in the context of QSS measures. The testing procedure uses the Nearest
Neighbor (NN) test to quantify the clustering effectiveness of QSS measures and the
corresponding term-weighting scheme.
The NN tests are applied to standard test document collections, and the results are
tested for statistical significance. Comparing the NN test results with those
obtained for the ltc scheme, we find several instances where the INQUERY scheme
improves the clustering effectiveness of QSS measures. To apply the NN test,
we designed a software test framework, Ferret, that complements the features provided
by dtSearch, a search engine. The test framework automates the generation of NN
coefficients by processing standard test document collection data. We also describe
the construction and operation of the Ferret test framework.
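
As a rough illustration of the two ingredients named above, the following Python sketch computes ltc weights in the usual SMART sense (logarithmic term frequency, inverse document frequency, cosine normalization) and one common formulation of the NN test: for each relevant document, count how many of its k nearest neighbors are also relevant, then average. It is not the Ferret framework and does not use dtSearch. The function names, the natural logarithm, the default k=5, and the plain cosine similarity are all assumptions made for illustration; the abstract does not define the QSS or INQUERY formulas, so neither is implemented here, and a QSS measure would be passed in through the hypothetical `similarity` parameter.

```python
import math
from collections import Counter


def ltc_vectors(docs):
    """Return one {term: weight} dict per document using ltc weighting.

    docs is a list of token lists. Weight = (1 + ln tf) * ln(N / df),
    followed by cosine (unit-length) normalization. Natural log is an
    assumption; the abstract does not specify the base.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in docs:
        term_freq = Counter(tokens)
        raw = {
            term: (1.0 + math.log(freq)) * math.log(n_docs / doc_freq[term])
            for term, freq in term_freq.items()
        }
        norm = math.sqrt(sum(w * w for w in raw.values()))
        # A document whose terms all occur in every document has norm 0.
        vectors.append({t: w / norm for t, w in raw.items()} if norm > 0 else {})
    return vectors


def cosine(u, v):
    """Dot product of two sparse vectors (cosine if both are unit length)."""
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(term, 0.0) for term, w in u.items())


def nn_coefficient(vectors, relevant, k=5, similarity=cosine):
    """Average number of relevant documents among the k nearest neighbors
    of each relevant document (illustrative formulation of the NN test).

    relevant is a set of document indices judged relevant to one query.
    similarity is a hypothetical hook where a QSS measure could be supplied.
    """
    counts = []
    for i in relevant:
        others = [j for j in range(len(vectors)) if j != i]
        others.sort(key=lambda j: similarity(vectors[i], vectors[j]), reverse=True)
        counts.append(sum(1 for j in others[:k] if j in relevant))
    return sum(counts) / len(counts)


if __name__ == "__main__":
    toy_docs = [
        ["apple", "banana"],
        ["apple", "banana", "cherry"],
        ["dog", "cat"],
        ["dog", "cat", "bird"],
    ]
    toy_vectors = ltc_vectors(toy_docs)
    print(nn_coefficient(toy_vectors, relevant={0, 1}, k=1))
```

Under this formulation a higher coefficient means relevant documents sit closer to one another, which is the sense in which one weighting scheme could be said to improve clustering effectiveness over another.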

http://hdl.handle.net/1969.1/3116

Information Retrieval

Query-Sensitive Similarity

Identifier	oai:union.ndltd.org:tamu.edu/oai:repository.tamu.edu:1969.1/3116
Date	12 April 2006
Creators	Kini, Ananth Ullal
Contributors	Nelson, Paul
Publisher	Texas A&M University
Source Sets	Texas A and M University
Language	en_US
Detected Language	English
Type	Book, Thesis, Electronic Thesis, text
Format	286564 bytes, electronic, application/pdf, born digital
