Relevance assessment is an important part of information retrieval (IR) evaluation, and it is also something that every user of an IR system must do when searching for relevant documents. In this thesis, we present a user study conducted to understand the relevance judging behaviour of assessors when the prevalence of relevant documents in the set of documents to be judged is varied. In our user study, we collected judgements from participants on document sets at three prevalence levels: low (0.1), balanced (0.5), and high (0.9). We found that participants who judged documents at the 0.9 level made the most mistakes, while participants who judged documents at the 0.5 level made the fewest. We did not find a statistically significant difference in judging quality between the 0.1 and 0.5 prevalence levels.
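The comparison described above amounts to testing whether assessors' error rates differ across prevalence conditions. The sketch below is purely illustrative and is not drawn from the thesis: the judgement counts are invented, and a chi-square test of independence is used as one plausible way to compare error rates between two conditions.

```python
# Illustrative sketch only: judgement counts are hypothetical, not from the thesis.
# Shows one way to compare judging error rates between prevalence conditions
# using a chi-square test of independence.
from scipy.stats import chi2_contingency

# Rows: prevalence condition; columns: [correct judgements, mistaken judgements].
# These counts are invented for illustration.
counts = {
    0.1: [92, 8],
    0.5: [95, 5],
    0.9: [80, 20],
}

def compare(level_a, level_b):
    """Test whether error rates differ between two prevalence conditions."""
    table = [counts[level_a], counts[level_b]]
    chi2, p, dof, _ = chi2_contingency(table)
    print(f"prevalence {level_a} vs {level_b}: chi2={chi2:.2f}, p={p:.3f}")

compare(0.1, 0.5)  # e.g. no significant difference in judging quality
compare(0.5, 0.9)  # e.g. more mistakes in the high-prevalence condition
```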
Identifier | oai:union.ndltd.org:WATERLOO/oai:uwspace.uwaterloo.ca:10012/6160
Date | 23 August 2011
Creators | Jethani, Chandra Prakash
Source Sets | University of Waterloo Electronic Theses Repository
Language | English
Detected Language | English
Type | Thesis or Dissertation