Return to search

A Novel Concept and Context-Based Approach for Web Information Retrieval

Web information retrieval is a relatively new research area that has attracted a significant amount of interest from researchers around the world since the emergence of the World Wide Web in the early 1990s. The problems facing successful web information retrieval are a combination of challenges that stem from traditional information retrieval and challenges characterised by the nature of the World Wide Web. The goal of any information retrieval system is to provide an information need fulfilment in response to an information need. In a web setting, this means retrieving as many relevant web documents as possible in response to an inputted query that is typically limited to only containing a few terms expressive of the user's information need. This thesis is primarily concerned with firstly reviewing pertinent literature related to various aspects of web information retrieval research and secondly proposing and investigating a novel concept and context-based approach. The approach consists of techniques that can be used together or independently and aim to provide an improvement in retrieval accuracy over other approaches. A novel concept-based term weighting technique is proposed as a new method of deriving query term significance from ontologies that can be used for the weighting of inputted queries. A technique that dynamically determines the significance of terms occurring in documents based on the matching of contexts is also proposed. Other contributions of this research include techniques for the combination of document and query term weights for the ranking of retrieved documents. All techniques were implemented and tested on benchmark data. This provides a basis for performing comparison with previous top performing web information retrieval systems. High retrieval accuracy is reported as a result of utilising the proposed approach. This is supported through comprehensive experimental evidence and favourable comparisons against previously published results.

Identiferoai:union.ndltd.org:ADTP/195606
Date January 2005
CreatorsZakos, John, n/a
PublisherGriffith University. School of Information and Communication Technology
Source SetsAustraliasian Digital Theses Program
LanguageEnglish
Detected LanguageEnglish
Rightshttp://www.gu.edu.au/disclaimer.html), Copyright John Zakos

Page generated in 0.0022 seconds