
Automatic indexing of bibliographic information

The use of automatic indexing methods in bibliographic information systems is examined. It is argued that automatic methods do not result in the best set of index terms being assigned to a document representative. A method, Document Learning, is proposed to improve the set of index terms generated by an automatic indexing system. A mathematical model is set up to describe the document learning process. This process uses relevance information associated with queries processed by the information system to alter the index terms associated with the retrieved document representatives. It is essential that the alterations are not drastic but instead form part of a process of gradual change. In this manner the document representatives build up new sets of index terms based on a collection of queries. To implement this policy of gradual change, the index terms must be weighted so that their significance can be increased or decreased. The learning function is the mathematical function that defines how document representatives are altered, and experimental work carried out to investigate its best form is described. After a suitable learning function had been selected, the document learning process itself was investigated. Pairs of queries were used to see how applying the document learning process to the results of one query would affect the retrieved document set of the second. Larger sets of queries were then used to simulate real users, and the effect on the retrieved document sets was noted. It is concluded that document learning can be effective in improving the output of automatic indexing methods, provided that the learning function acts in a gradual manner.
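The abstract does not state which learning function was selected, so the following is only an illustrative sketch of the general idea of gradual, weight-based change. It assumes a simple smoothing-style update in which the weights of a document's index terms move a small fraction towards 1 when the document is judged relevant to a query and a small fraction towards 0 when it is not. The function name `learn`, the parameter `rate`, and the update rule itself are hypothetical and are not taken from the thesis.

```python
def learn(doc_weights, query_terms, relevant, rate=0.1):
    """Return a new term -> weight mapping after one relevance judgment.

    Hypothetical update rule (an assumption, not the thesis's function):
    weights lie in [0, 1] and move only a fraction `rate` of the
    remaining distance, so no single query changes a document drastically.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")

    updated = dict(doc_weights)  # leave the caller's mapping untouched
    for term in query_terms:
        if relevant:
            # Strengthen the term, adding it if the document lacked it.
            w = updated.get(term, 0.0)
            updated[term] = w + rate * (1.0 - w)
        elif term in updated:
            # Weaken only terms the document already carries.
            updated[term] -= rate * updated[term]
    return updated


# One document representative processed against two judged queries.
doc = {"indexing": 0.5, "automatic": 0.4}
doc = learn(doc, ["indexing", "retrieval"], relevant=True)
doc = learn(doc, ["automatic"], relevant=False)
for term in sorted(doc):
    print(term, round(doc[term], 3))
```

In this sketch a small `rate` corresponds to the gradual change the abstract calls for, while a value near 1 would let a single query largely overwrite a document's index terms.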

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:638591
Date: January 1981
Creators: Purgailis, L. M.
Publisher: Swansea University
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
