The recent emergence of intelligent agent technology and advances in information gathering have been the important steps forward in efficiently managing and using the vast amount of information now available on the Web to make informed decisions. There are, however, still many problems that need to be overcome in the information gathering research arena to enable the delivery of relevant information required by end users.
Good decisions cannot be made without sufficient, timely, and correct information. Traditionally it is said that knowledge is power, however, nowadays sufficient, timely, and correct information is power. So gathering relevant information to meet user information needs is the crucial step for making good decisions.
The ideal goal of information gathering is to obtain only the information that users need (no more and no less). However, the volume of information available, diversity formats of information, uncertainties of information, and distributed locations of information (e.g. World Wide Web) hinder the process of gathering the right information to meet the user needs. Specifically, two fundamental issues in regard to efficiency of information gathering are mismatch and overload. The mismatch means some information that meets user needs has not been gathered (or missed out), whereas, the overload means some gathered information is not what users need.
Traditional information retrieval has been developed well in the past twenty years. The introduction of the Web has changed people's perceptions of information retrieval. Usually, the task of information retrieval is considered to have the function of leading the user to those documents that are relevant to his/her information needs. The similar function in information retrieval is to filter out the irrelevant documents (or called information filtering). Research into traditional information retrieval has provided many retrieval models and techniques to represent documents and queries. Nowadays, information is becoming highly distributed, and increasingly difficult to gather. On the other hand, people have found a lot of uncertainties that are contained in the user information needs. These motivate the need for research in agent-based information gathering.
Agent-based information systems arise at this moment. In these kinds of systems, intelligent agents will get commitments from their users and act on the users behalf to gather the required information. They can easily retrieve the relevant information from highly distributed uncertain environments because of their merits of intelligent, autonomy and distribution. The current research for agent-based information gathering systems is divided into single agent gathering systems, and multi-agent gathering systems. In both research areas, there are still open problems to be solved so that agent-based information gathering systems can retrieve the uncertain information more effectively from the highly distributed environments.
The aim of this thesis is to research the theoretical framework for intelligent agents to gather information from the Web. This research integrates the areas of information retrieval and intelligent agents. The specific research areas in this thesis are the development of an information filtering model for single agent systems, and the development of a dynamic belief model for information fusion for multi-agent systems. The research results are also supported by the construction of real information gathering agents (e.g., Job Agent) for the Internet to help users to gather useful information stored in Web sites. In such a framework, information gathering agents have abilities to describe (or learn) the user information needs, and act like users to retrieve, filter, and/or fuse the information.
A rough set based information filtering model is developed to address the problem of overload. The new approach allows users to describe their information needs on user concept spaces rather than on document spaces, and it views a user information need as a rough set over the document space. The rough set decision theory is used to classify new documents into three regions: positive region, boundary region, and negative region. Two experiments are presented to verify this model, and it shows that the rough set based model provides an efficient approach to the overload problem.
In this research, a dynamic belief model for information fusion in multi-agent environments is also developed. This model has a polynomial time complexity, and it has been proven that the fusion results are belief (mass) functions. By using this model, a collection fusion algorithm for information gathering agents is presented. The difficult problem for this research is the case where collections may be used by more than one agent. This algorithm, however, uses the technique of cooperation between agents, and provides a solution for this difficult problem in distributed information retrieval systems.
This thesis presents the solutions to the theoretical problems in agent-based information gathering systems, including information filtering models, agent belief modeling, and collection fusions. It also presents solutions to some of the technical problems in agent-based information systems, such as document classification, the architecture for agent-based information gathering systems, and the decision in multiple agent environments. Such kinds of information gathering agents will gather relevant information from highly distributed uncertain environments.
Identifer | oai:union.ndltd.org:ADTP/217141 |
Date | January 2000 |
Creators | Li, Yuefeng, mikewood@deakin.edu.au |
Publisher | Deakin University. School of Computing and Mathematics |
Source Sets | Australiasian Digital Theses Program |
Language | English |
Detected Language | English |
Rights | http://www.deakin.edu.au/disclaimer.html), Copyright Yuefeng Li |
Page generated in 0.0026 seconds