Using document clustering and language modelling in mediated information retrieval

Our work addresses a well-documented problem: users are frequently unable to articulate a query that clearly and comprehensively expresses their information need. This can be attributed to the information need being too ambiguous and not clearly defined in the user's mind, to a lack of knowledge of the domain of interest on the part of the user, to a lack of understanding of a retrieval system's conceptual model, or to an inability to use a certain query syntax. This thesis proposes a software tool that emulates the human search mediator. It helps a user explore a domain of interest, learn its structure, terminology and key concepts, and clarify and refine an information need. It can also help a user generate high-quality queries for searching the World Wide Web or other such large and heterogeneous document collections. Our work was inspired by library studies which have highlighted the role of the librarian in helping the user explore her information need, define the problem to be solved, articulate a formulation of the information need and adapt it for the retrieval system at hand in order to get information. Our approach, mediated access through a clustered collection, is based on an information access environment in which the user can explore a relatively small, well-structured, pre-clustered document collection covering a particular subject domain, in order to understand the concepts encompassed and to clarify and refine her information need. At the same time, the user can ostensively indicate clusters and documents of interest so that the system builds a model of the user's topic of interest. Based on this model, the system assists and guides the user's exploration, or generates 'mediated queries' that can be used to search other collections. We present the design and evaluation of WebCluster, a system that reifies the concept of mediated retrieval.
Additionally, a variety of mediation experiments are presented, which provide guidelines as to which mediation strategies are more appropriate for different types of tasks. A set of experiments is presented that evaluates document clustering's capacity to group together topical documents and support mediation. In this context we propose and experimentally test a new formulation for the cluster hypothesis. We also look at the ability of language models to convey content, to represent topics and to highlight specific concepts in a given context. Language models are also successfully applied to generate flexible, task-dependent cluster representatives for supporting exploration through browsing and searching, respectively. Our experimental results show that mediation has the potential to significantly improve user queries and, consequently, retrieval effectiveness.
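The abstract does not say how a topic model is turned into a 'mediated query', so the following is only an illustrative sketch of one common approach, not the method used in the thesis or in WebCluster. It assumes a unigram language model estimated from the documents the user has marked as interesting, smoothed with the model of the whole collection, and it ranks terms by their contribution to the Kullback-Leibler divergence between the two. The function names (`tokenize`, `unigram_counts`, `mediated_query`) and the parameters `k` and `lam` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def unigram_counts(docs):
    """Count term occurrences over a list of document strings."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return counts


def mediated_query(selected_docs, collection_docs, k=5, lam=0.5):
    """Return the k terms that best distinguish the selected documents.

    Assumption: selected_docs is a subset of collection_docs, so every
    selected term also has a non-zero collection probability.
    """
    topic = unigram_counts(selected_docs)
    background = unigram_counts(collection_docs)
    topic_total = sum(topic.values())
    background_total = sum(background.values())

    scores = {}
    for term, tf in topic.items():
        p_bg = background[term] / background_total
        if p_bg == 0:
            continue  # term missing from the collection; skip it
        # Smooth the topic estimate with the collection model.
        p_topic = lam * (tf / topic_total) + (1 - lam) * p_bg
        # Term-wise contribution to KL(topic || collection).
        scores[term] = p_topic * math.log(p_topic / p_bg)

    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]


collection = [
    "clustering groups similar documents into clusters",
    "language models estimate term probabilities for documents",
    "the librarian helps the user refine the query",
    "football scores and match results for the weekend",
]
# The user marks the first two documents as relevant to her topic.
query_terms = mediated_query(collection[:2], collection, k=3)
print(" ".join(query_terms))
```

Terms that are frequent in the selected documents but rare in the collection score highest, so the resulting query leans towards the vocabulary of the user's topic rather than towards generally common words.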

http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.391602

Identifier	oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:391602
Date	January 2002
Creators	Muresan, Gheorghe
Contributors	Harper, David J.
Publisher	Robert Gordon University
Source Sets	Ethos UK
Detected Language	English
Type	Electronic Thesis or Dissertation
Source	http://hdl.handle.net/10059/623
