Search algorithms on structured and unstructured data in a large database

This project is concerned with the development of a search algorithm for a large archival database. The Port Elizabeth Genealogical Information System (PEGIS) contains a database of almost 600 000 individuals. The standard search algorithms are no longer sufficient to locate individuals in the database. A new algorithm was required that allows searches on any of the words or dates in the database and provides a means to specify where in the desired record a word should occur. A function for ranking the retrieved records was also required. A literature study was done on the field of Information Retrieval and on algorithms designed specifically for the PEGIS. These algorithms were adapted and hybridized to yield a search algorithm that allows for the Boolean formulation of queries and the specification of the structure of search words in the desired records. The algorithm ranks retrieved records by their assumed relevance to the user. The new algorithms were evaluated with regard to retrieval speed and accuracy and were found to be very effective.
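As an illustration only, the kind of search described above (Boolean queries, an optional restriction on where in the record a word occurs, and a ranking of the results) can be sketched with a small inverted index. This is not the algorithm developed in the thesis; the records, the field names (surname, place, year), the function names, and the match-count ranking are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical records; the field names and values are invented for illustration.
RECORDS = {
    1: {"surname": "smith", "place": "port elizabeth", "year": "1850"},
    2: {"surname": "elizabeth", "place": "grahamstown", "year": "1850"},
    3: {"surname": "smith", "place": "grahamstown", "year": "1862"},
}


def build_index(records):
    """Map (field, word) and (None, word) to the set of record ids containing the word."""
    index = defaultdict(set)
    for rec_id, fields in records.items():
        for field, text in fields.items():
            for word in text.lower().split():
                index[(field, word)].add(rec_id)  # word in a specific field
                index[(None, word)].add(rec_id)   # word anywhere in the record
    return index


def lookup(index, word, field=None):
    """Return the ids of records containing word, optionally only in the given field."""
    return index.get((field, word.lower()), set())


def search_and(index, terms):
    """Boolean AND: terms is a list of (word, field_or_None); all must match."""
    sets = [lookup(index, word, field) for word, field in terms]
    return set.intersection(*sets) if sets else set()


def search_or_ranked(index, terms):
    """Boolean OR, ranked by the number of matched terms (assumed relevance), then by id."""
    scores = defaultdict(int)
    for word, field in terms:
        for rec_id in lookup(index, word, field):
            scores[rec_id] += 1
    return sorted(scores, key=lambda rec_id: (-scores[rec_id], rec_id))
```

In this sketch, searching for "elizabeth" anywhere matches both a surname and a place name, while restricting the word to the place field narrows the result to one record. A real system of this size would also need date handling, spelling variants, and a more considered relevance measure.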

Identifier: oai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:nmmu/vital:11094
Date: January 2004
Creators: Du Plessis, Mathys Cornelius
Publisher: University of Port Elizabeth, Faculty of Science
Source Sets: South African National ETD Portal
Language: English
Detected Language: English
Type: Thesis, Masters, MSc
Format: 158 p, pdf
Rights: Nelson Mandela Metropolitan University
