Computers have made a vast amount of information available. It is now far easier to publish information on a CD-ROM or on the Internet than it is to find the specific information that fills someone's need. Expecting all users to be experts at navigating this volume of data is unrealistic. Information retrieval systems are designed to help users sort through this sea of text: they search for documents that match a user's information need, based on some user-supplied representation of that need. One important consideration is that naive users, the ones who most need help, are unlikely to be able to express their need in the best possible way. Specifying a query is difficult for the user to do well and for the system to understand completely. One important source of information about the user's need is a collection of example documents that illustrate how that need can be met. These documents not only provide more information than the user could possibly specify directly, but can also often be obtained at low cost. This dissertation describes a probabilistic theory of how to use the information in example documents to automatically improve a user's query and thereby improve the effectiveness of the information retrieval system. It does so by extending the inference network model of information retrieval developed by Turtle and Croft (47), adding the mechanism of annotated inference networks and providing methods to measure and control the contribution of individual components of a query. The research described here not only provides a sound theoretical understanding of how to extract information from example documents but also suggests methods that lead to practical improvements in performance.
Identifier | oai:union.ndltd.org:UMASS/oai:scholarworks.umass.edu:dissertations-2778 |
Date | 01 January 1996 |
Creators | Haines, David Leon |
Publisher | ScholarWorks@UMass Amherst |
Source Sets | University of Massachusetts, Amherst |
Language | English |
Detected Language | English |
Type | text |
Source | Doctoral Dissertations Available from Proquest |