Automated Knowledge Extraction from Archival Documents

Traditional archival media such as paper, film, and photographs hold a vast store of knowledge. Much of this knowledge is applicable to current business and scientific problems and offers solutions; consequently, there is value in extracting it. While the content can be extracted manually, doing so is not feasible for large knowledge repositories because of the cost and time involved. In this thesis, we develop a system that can extract such knowledge automatically from large repositories. A graphical user interface that permits users to indicate the location of the knowledge components (indexes) is developed, and software features that permit automatic extraction of indexes from similar documents are presented. The indexes and the documents are stored in a persistent data store. The system is tested on a University Registrar’s legacy paper-based transcript repository. The study shows that the system provides a good solution for large-scale extraction of knowledge from archived paper and other media.

Identifier: oai:union.ndltd.org:auctr.edu/oai:digitalcommons.auctr.edu:cauetds-1371
Date: 31 July 2019
Creators: Malki, Khalil
Publisher: DigitalCommons@Robert W. Woodruff Library, Atlanta University Center
Source Sets: Atlanta University Center
Detected Language: English
Type: text
Format: application/pdf
Source: Electronic Theses & Dissertations Collection for Atlanta University & Clark Atlanta University
