1 |
Query Expansion: en jämförande studie av Automatisk Query Expansion med och utan relevans-feedback / Query Expansion: a comparative study of Automatic Query Expansion with and without relevance feedback
Ekberg-Selander, Karin; Enberg, Johanna. January 2007
In query expansion (QE), terms are added to an initial query in order to improve retrieval effectiveness. In this thesis we use QE in the sense that the query is reformulated by deleting the terms of the initial query and replacing them with terms from the documents retrieved in the initial run. The aim of this thesis is to study and compare, in an experimental full-text environment, the retrieval results of two different query expansion strategies. The study addresses the following questions: How do the two strategies perform in relation to each other regarding recall? What may be causing the result? Are the two strategies retrieving the same relevant documents? Two strategies are designed to simulate a searcher using automatic query expansion (AQE) either with or without relevance feedback. Strategy I simulates AQE without relevance feedback by taking the top five documents retrieved in the initial run and extracting the ten most frequently occurring terms in these to create a new query. Correspondingly, Strategy II simulates AQE with relevance feedback by taking the top five relevant documents and extracting the top ten terms in these to create a new query. It is concluded that the retrieval performance of both strategies improved for most of the topics. On average, Strategy II achieved 54.63 percent recall, compared with 45.59 percent for Strategy I. The two strategies retrieved different relevant documents for the majority of the topics. Hence, it would be reasonable to base a system on both of them. / Thesis level: D
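The two strategies described above can be illustrated with a minimal Python sketch. It is not the thesis authors' implementation: the function names `expand_query` and `recall`, the whitespace tokenization, and the absence of stop-word removal or stemming are all assumptions made for illustration. The only details taken from the abstract are the defaults of five documents and ten terms, and the fact that the new query replaces the initial one.

```python
from collections import Counter


def expand_query(ranked_docs, relevant_ids=None, n_docs=5, n_terms=10):
    """Build a replacement query from documents of the initial run.

    ranked_docs:  list of (doc_id, text) pairs in rank order.
    relevant_ids: None simulates Strategy I (no relevance feedback);
                  a set of judged-relevant ids simulates Strategy II.
    Tokenization by lowercase whitespace split is an assumption.
    """
    if relevant_ids is None:
        # Strategy I: top-ranked documents, relevant or not.
        pool = ranked_docs[:n_docs]
    else:
        # Strategy II: top-ranked documents that are judged relevant.
        pool = [d for d in ranked_docs if d[0] in relevant_ids][:n_docs]

    counts = Counter()
    for _, text in pool:
        counts.update(text.lower().split())

    # The most frequent terms form the new query.
    return [term for term, _ in counts.most_common(n_terms)]


def recall(retrieved_ids, relevant_ids):
    """Fraction of the relevant documents that were retrieved."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(retrieved_ids) & relevant) / len(relevant)
```

Passing `relevant_ids=None` corresponds to Strategy I, while passing the set of judged-relevant document ids corresponds to Strategy II. In a realistic setting the term counts would be dominated by function words, so some form of stop-word filtering would probably be needed before `most_common` is applied.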
|
2 |
On the effect of INQUERY term-weighting scheme on query-sensitive similarity measures
Kini, Ananth Ullal. 12 April 2006
Cluster-based information retrieval systems often use a similarity measure to compute the
association among text documents. In this thesis, we focus on a class of similarity
measures named Query-Sensitive Similarity (QSS) measures. Recent studies have shown
QSS measures to positively influence the outcome of a clustering procedure. These
studies have used QSS measures in conjunction with the ltc term-weighting scheme.
Several term-weighting schemes have superseded the ltc term-weighting scheme and
demonstrated better retrieval performance relative to the latter. We test whether
introducing one of these schemes, INQUERY, will offer any benefit over the ltc scheme
when used in the context of QSS measures. The testing procedure uses the Nearest
Neighbor (NN) test to quantify the clustering effectiveness of QSS measures and the
corresponding term-weighting scheme.
The NN tests are applied to certain standard test document collections and the results are
tested for statistical significance. On analyzing results of the NN test relative to those
obtained for the ltc scheme, we find several instances where the INQUERY scheme
improves the clustering effectiveness of QSS measures. To be able to apply the NN test,
we designed a software test framework, Ferret, by complementing the features provided
by dtSearch, a search engine. The test framework automates the generation of NN
coefficients by processing standard test document collection data. We provide insight
into the construction and workings of the Ferret test framework.
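As a rough illustration of what an NN coefficient might look like, the sketch below assumes the common formulation of the Nearest Neighbor test: for each document relevant to a query, find its n most similar documents and count how many of them are also relevant, then average over the relevant documents. The abstract does not state the exact formulation used, and this is not the Ferret code. The names `cosine` and `nn_coefficient`, the default n = 5, and the use of plain cosine similarity over term-weight dictionaries are assumptions; a query-sensitive measure could be passed in through the hypothetical `similarity` parameter instead.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nn_coefficient(vectors, relevant_ids, similarity=cosine, n=5):
    """Mean number of relevant documents among the n nearest
    neighbours of each relevant document (assumed formulation).

    vectors:      dict mapping doc_id -> term-weight dictionary.
    relevant_ids: set of doc_ids judged relevant to one query.
    """
    counts = []
    for rid in relevant_ids:
        # Rank every other document by similarity to this relevant one.
        others = [d for d in vectors if d != rid]
        others.sort(key=lambda d: similarity(vectors[rid], vectors[d]),
                    reverse=True)
        counts.append(sum(1 for d in others[:n] if d in relevant_ids))
    return sum(counts) / len(counts) if counts else 0.0
```

Under this formulation, a higher coefficient suggests that relevant documents sit closer to one another, which is the property a clustering procedure would benefit from.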
|