The formal testing of IR systems involves setting up an experimental corpus (test collection, indexing, test questions etc.) and measuring performance under specific experimental conditions. However, the object of any IR test is to make a prediction, extrapolating from the test conditions, about a different situation. Up to now, this extrapolation has been made on the assumption that the relative performance of different systems is independent of the situation; but in some recent reported experiments, the assumption has been found not to hold. A more formal theory, indicating how the performance of a system varies according to the conditions, is therefore needed. Elements of such a theory are proposed and discussed. Particular areas identified as being of concern to such a theory are: the nature of relevance, the various possible forms of the Swets 'control variable', and the psychology of the searching process. The basic quantitative concept used in the theory is the probability, for each question, that the system will retrieve a document of a given degree of relevance to that question. Present methods of estimating these probabilities appear to be inadequate; a new method based on a Bayesian approach is developed. The method involves a prior assumption about the distribution of these probabilities over different questions, and makes use of this distribution in estimating them. The new method is tested on a variety of sets of data from published tests. The results appear satisfactory, and in some cases suggest new ways of looking at the data. They also indicate that some quantitative aspects of the theory will need modification. Further developments of the theory include: an indication of the possible uses of simulation models, and an application of the theory to the problem of how to make best use of relevance feedback data.
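
The general idea of using a prior distribution over questions to estimate per-question retrieval probabilities can be illustrated with a minimal sketch. This is not the method developed in the thesis: the abstract does not state the form of the prior, so the Beta prior, the crude method-of-moments fit, and the function name `shrink_retrieval_probabilities` are all assumptions made here for illustration only.

```python
def shrink_retrieval_probabilities(counts):
    """Estimate a per-question retrieval probability with a shared prior.

    counts: list of (retrieved, total) pairs, one per question, where
    `total` is the number of documents of a given relevance degree and
    `retrieved` is how many of them the system retrieved.

    ASSUMPTION: the prior over questions is a Beta(a, b) distribution,
    fitted by matching the mean and variance of the raw proportions.
    This fit ignores binomial sampling noise, so it is only a rough sketch.
    """
    rates = [r / n for r, n in counts]
    k = len(rates)
    mean = sum(rates) / k
    var = sum((p - mean) ** 2 for p in rates) / (k - 1)

    # Method of moments for a Beta distribution: a + b = mean(1 - mean)/var - 1.
    limit = mean * (1 - mean)
    if var <= 0 or var >= limit:
        # Fit is not usable; fall back to a uniform prior (also an assumption).
        a, b = 1.0, 1.0
    else:
        strength = limit / var - 1
        a, b = mean * strength, (1 - mean) * strength

    # Posterior mean for each question: raw proportion pulled toward the prior mean.
    return [(r + a) / (n + a + b) for r, n in counts]


if __name__ == "__main__":
    # Hypothetical counts, not data from any published test.
    example = [(0, 5), (2, 5), (5, 5), (3, 10)]
    for (r, n), est in zip(example, shrink_retrieval_probabilities(example)):
        print(f"{r}/{n} -> {est:.3f}")
```

In this sketch, questions with extreme raw proportions such as 0/5 or 5/5 receive estimates strictly between 0 and 1, which is one reason a prior over questions can be preferable to using the raw proportions directly.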
|Creators||Robertson, S. E.|
|Publisher||University College London (University of London)|
|Source Sets||Ethos UK|
|Type||Electronic Thesis or Dissertation|